Share performance measurement with IDP #352

Open
yi-gu opened this issue Sep 26, 2022 · 3 comments
yi-gu commented Sep 26, 2022

Motivation

A successful web API must perform to the level required by the developers that use it. Sharing metrics with developers to help them monitor how the API performs benefits developers, users, and the FedCM API itself. e.g. an identity provider (IDP) can debug relying party (RP) specific deployment issues, or monitor timing measurements to fix bottlenecks so that users have a smoother federation experience. IDP developers can also provide feedback based on the metrics to improve the FedCM API.

To achieve this goal at scale, we propose sending data to the IDP via a new endpoint in the config file (optional, for IDPs that want to receive measurements).

Proposal

Server

If an IDP wants to receive the measurements, it should specify a new metrics_endpoint in its config file, e.g.:

{
  "metrics_endpoint": "/metrics.php"
}

Similar to other endpoints, it must be same-origin with the IDP.
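The same-origin check can be sketched as follows; resolve_metrics_endpoint is an illustrative helper, not part of the proposal:

```python
from urllib.parse import urljoin, urlsplit

def resolve_metrics_endpoint(config_url: str, metrics_endpoint: str) -> str:
    """Resolve the (possibly relative) metrics_endpoint against the config URL
    and enforce the same-origin requirement, as for the other endpoints."""
    resolved = urljoin(config_url, metrics_endpoint)
    # Compare (scheme, host[:port]) of the config file and the resolved endpoint.
    if urlsplit(config_url)[:2] != urlsplit(resolved)[:2]:
        raise ValueError("metrics_endpoint must be same-origin with the IDP")
    return resolved

# A relative path resolves against the IDP origin:
resolve_metrics_endpoint("https://idp.example/fedcm.json", "/metrics.php")
# → "https://idp.example/metrics.php"
```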

Client

Depending on whether the API call succeeds (the user grants permission via the "Continue as" button), we can share different data with the IDP. For privacy reasons, any data sent to the IDP must be uncredentialed and must not contain user information.

API Succeeded

When the API call succeeds, i.e. the IDP has issued an id token to the RP upon the user’s approval, we can send the following timing information to the IDP:

  • the time from when a call to the API was made to when the accounts dialog is shown.
    • the time from when a call to the API was made to when the account request was sent
    • the time from when the account request was sent to when the account response was received
    • the time from when the client metadata request was sent to when the client metadata response was received
  • the time from when the accounts dialog is shown to when the user presses the Continue button.
  • the time from when the user presses the Continue button to when the id token response is received.
  • the overall time from when the API is called to when the id token response is received.
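The durations above can be derived from event timestamps the browser records; a minimal sketch of a few of the reported fields (the timestamp keys here are illustrative, not part of the proposal):

```python
def compute_timings(t: dict) -> dict:
    """Derive reported durations (ms) from browser-recorded event timestamps."""
    return {
        # Time from the API call until the accounts dialog is shown.
        "time_to_show_ui": t["ui_shown"] - t["api_called"],
        # Time between sending the accounts request and receiving the response.
        "time_to_receive_accounts": t["accounts_received"] - t["accounts_requested"],
        # Time the user spent on the dialog before pressing Continue.
        "time_to_continue_on_ui": t["continue_pressed"] - t["ui_shown"],
        # Time from Continue until the id token response arrives.
        "time_to_receive_token": t["token_received"] - t["continue_pressed"],
        # Overall turnaround covers the full span of the flow.
        "turnaround_time": t["token_received"] - t["api_called"],
    }
```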

A sample request to the metrics endpoint in the API success case looks like:

POST /metrics_endpoint HTTP/1.1
Host: idp.example
Origin: https://rp.example/
Content-Type: application/json
Sec-Fetch-Dest: webidentity

{
    "body": {
      "clientId": "client123",
      "timing": {
        "time_to_show_ui": 3000,
        "time_to_receive_configuration": 1000,
        "time_to_receive_accounts": 1000,
        "time_to_receive_client_metadata": 1000,
        "time_to_continue_on_ui": 3000,
        "time_to_receive_token": 2000,
        "turnaround_time": 8000
      }
    },
    "url": "https://rp.example/"
}

Note: there could be survivorship bias; e.g. higher latency may lead to a lower reporting rate.
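On the server side, an IDP could ingest such a report with a handler along these lines; ingest_success_report and record_metric are illustrative names, assuming the JSON payload shape shown above:

```python
import json

# Field names from the sample success payload.
EXPECTED_TIMINGS = {
    "time_to_show_ui", "time_to_receive_configuration", "time_to_receive_accounts",
    "time_to_receive_client_metadata", "time_to_continue_on_ui",
    "time_to_receive_token", "turnaround_time",
}

def ingest_success_report(raw: bytes, record_metric) -> None:
    """Parse a success report and forward known timing fields to the IDP's
    analytics pipeline (record_metric is a placeholder for that pipeline)."""
    report = json.loads(raw)
    rp_url = report["url"]          # the RP the user signed in on
    body = report["body"]
    client_id = body["clientId"]
    for name, value in body["timing"].items():
        # Ignore unknown fields and non-numeric values defensively.
        if name in EXPECTED_TIMINGS and isinstance(value, (int, float)):
            record_metric(rp_url, client_id, name, value)
```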

API failed

For privacy reasons, we cannot inform the RP “when the API call fails and why” in most cases via DOM exceptions. Thus, it’s hard for developers to understand the performance of the API without knowing the API call status.

One important reason why we cannot reject the promise immediately is that the API is called on the RP site and we do not want to leak user or IDP information to RP. However, we could communicate with IDP directly with failure information depending on the failure type.

  • If the API call failed due to IDP configuration rather than user action, we can expose the reason for the failure via the metrics endpoint immediately. e.g. the endpoints are misconfigured, the response is malformed, etc.

  • If the API call failed due to the RP (multiple in-flight requests, aborting, etc.), we currently reject the promise immediately, so an IDP whose SDK the RP uses can already learn that information from the rejection. However, an RP can also call the API directly without an IDP SDK, and in that case we should not expose RP failure details to the IDP.

  • If the API call failed due to the user (declined or ignored the permission, disabled FedCM, not signed in to the IDP, etc.), we should expose a generic failure code instead of the exact reason, because the IDP could forward anything we share with them to the RP without user permission.

Failure types

Type | Error Code | Error Message | Description
---- | ---------- | ------------- | -----------
IDP | 101 | Unavailable server: 404 | HTTP 404
IDP | 102 | Invalid response to the well-known file request | Invalid WellKnown Response
IDP | 103 | Invalid response to the config file request | Invalid Config Response
IDP | 104 | Invalid response to the accounts request | Invalid Accounts Response
IDP | 105 | Invalid response to the client metadata request | Invalid ClientMetadata Response
IDP | 106 | Invalid response to the token request | Invalid Token Response
RP | 201 | RP related failure | Multiple Inflight Requests OR Request Aborted
User | 301 | User related failure | User Dismissed UI OR Ignored UI OR Disabled FedCM OR In Cooldown

Note

  • The list may grow as we develop new features with new failure types.
  • Some failures may become obsolete; e.g. we may support multiple in-flight requests in the future.
  • Some failures (e.g. 102-104) are IDP specific, but exposing the RP in the report could be beneficial too; e.g. if an IDP sees an abnormal spike of Invalid Accounts Response for a specific RP, it can investigate the cause (e.g. a CSP issue or a geo-based network issue).
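The reporting policy described above can be sketched as a mapping from internal failure codes to what actually gets sent; failure_report is a hypothetical helper, and the code ranges assume the numbering in the table:

```python
GENERIC_RP_FAILURE = 201
GENERIC_USER_FAILURE = 301

def failure_report(error_code: int):
    """Decide which errorId (if any) to send to the metrics endpoint."""
    if 101 <= error_code <= 106:
        # IDP misconfiguration: safe to report the specific reason.
        return {"errorId": error_code}
    if 201 <= error_code <= 299:
        # RP failure: the promise rejection already reveals it, report generically.
        return {"errorId": GENERIC_RP_FAILURE}
    if 301 <= error_code <= 399:
        # User action: never reveal the exact reason, only a generic code.
        return {"errorId": GENERIC_USER_FAILURE}
    return None  # unknown code: report nothing
```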

A sample request to the metrics endpoint looks like:

POST /metrics_endpoint HTTP/1.1
Host: idp.example
Origin: https://rp.example/
Content-Type: application/json
Sec-Fetch-Dest: webidentity

{
    "body": {
      "clientId": "client123",
      "errorId": 106,
      "message": "Invalid response for the token request"
    },
    "url": "https://rp.example/"
}

Forwards Compatibility Consideration

As noted in the blog post, the FedCM API is under active development, and some new features may have an impact on the metrics endpoint as well. Notably:

  • IdpSigninStatus API
    The browser can know whether a user is signed in to an IDP. If a user is not signed in to an IDP, in the single-IDP case it’s OK to send a generic UserFailure to the IDP. Once we support multiple IDPs, we should be consistent about whether to send the failure to such an IDP regardless of whether it’s used as a single IDP or as part of multiple IDPs. See below for more discussion.
  • Multiple IDP support
    The browser can support multiple IDPs in the same federation flow.

Once we support multiple IDPs, there will be some new challenges:

  • From a privacy perspective, there’s a new party in the flow (e.g. {user, RP, IDP1, IDP2}), so we need to make sure that information related to IDP1 doesn’t get exposed to IDP2. e.g. if the user chose IDP1 in the FedCM UI, IDP2 should NOT be notified that the user chose another IDP; otherwise IDP2 could infer that the user had a session with IDP1.
  • From a scalability perspective, if 1000 IDPs were specified in the API call, it would be suboptimal to send an error to the other 999 IDPs when the user chose one of them. To mitigate this, we can choose to send data only to the IDP the user is currently signing in with.
    • Ideally, we should keep the strategy consistent to avoid regressing IDPs’ dashboards. e.g. if a user is not signed in to an IDP today, we send errors to that IDP; once we support multiple IDPs, if we no longer send those errors, their dashboards may be affected.
    • Unfortunately, today the only way the browser can tell whether a user is signed in to an IDP is by checking the accounts response. If we used that signal to decide whether to send errors to the IDP, it would arrive too late to be useful.
    • Luckily, with the IdpSigninStatus API, the browser can tell at an early phase and send all types of errors only to the IDPs the user is signed in with. This would, however, have an impact on IDPs that implement the metrics endpoint today.
    • Alternatively, we could allow IDPs to subscribe to the metrics regardless by introducing a new parameter in the config file: metrics_subscription: [always|signin]. “always” means the browser will send metrics to the IDP regardless of the IdpSigninStatus, while “signin” means the browser only sends metrics if the user is signed in to the IDP.
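Under this alternative, an IDP could opt in from its config file; the metrics_subscription field is part of the alternative being discussed, not the base proposal:

```json
{
  "metrics_endpoint": "/metrics.php",
  "metrics_subscription": "always"
}
```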

Privacy Consideration

The new endpoint is for IDPs to receive API call status. Thus, it should follow the same principles as the FedCM API:

  • IDP cannot learn about which RP a certain user is visiting before user permission
  • RP cannot learn about whether a certain user is signed in with a given IDP before user permission

In this proposal, all data shared with the IDP is uncredentialed (no user cookies), so the IDP cannot correlate the RP-centric measurements with individual users. (Note: there’s a known timing-attack problem that’s orthogonal to this proposal; once the IdpSigninStatus API is launched, it can be mitigated.)

Security Consideration

The new endpoint follows the same standards as other FedCM endpoints; e.g. requests carry a Sec-Fetch-Dest: webidentity header and an Origin header instead of Referer.
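An IDP server could sanity-check these headers before accepting a report; check_metrics_request is an illustrative helper, and since headers can be forged outside a browser, this only filters out casual noise rather than determined attackers:

```python
def check_metrics_request(headers: dict) -> bool:
    """Reject requests that don't look like browser-issued FedCM fetches."""
    # Browsers tag FedCM requests with this fetch metadata header.
    if headers.get("Sec-Fetch-Dest") != "webidentity":
        return False
    # FedCM sends Origin (the RP) rather than Referer.
    return "Origin" in headers and "Referer" not in headers
```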

In addition, we use an uncredentialed POST to the IDP endpoint, which could technically be abused; e.g. an attacker can use curl outside of any browser to POST data to the endpoint. As with the Reporting API, it’s typically up to the server to build mechanisms to mitigate this.

Considered Alternatives

Reporting API

The Reporting API allows browsers to send reports created based on various platform features (e.g. document policy violation, CSP violation etc.) to web developers to help them with fixing bugs or improving their websites. See a sample report below for document policy violation:

[
  {
    "age": 420,
    "body": {
      "columnNumber": 12,
      "disposition": "enforce",
      "lineNumber": 11,
      "message": "Document policy violation: document-write is not allowed in this document.",
      "policyId": "document-write",
      "sourceFile": "https://site.example/script.js"
    },
    "type": "document-policy-violation",
    "url": "https://site.example/",
    "user_agent": "Mozilla/5.0... Chrome/92.0.4504.0"
  },
]

The Reporting API should be able to achieve the same goal. That said, a dedicated endpoint is the better choice because:

  • From an ergonomics perspective, it’s easier for IDPs to implement since it’s very similar to the other endpoints. Developers who don’t support the Reporting API today would need to build entirely new infrastructure to receive reports. Even if an IDP supports the Reporting API, the reporting infrastructure is often maintained by a different team (e.g. a core analytics team) than the one using the FedCM API, which adds friction for developers.
  • The Reporting API is not fully interoperable at the moment; taking a dependency on it could eventually affect FedCM adoption.

That being said, we chose a report format that’s compatible with the Reporting API in case we want to switch to it in the future.

@samuelgoto samuelgoto added the agenda+ Regular CG meeting agenda items label Nov 3, 2022
yi-gu commented Dec 5, 2022

We have updated the proposal with details. PTAL.
@bvandersloot-mozilla FYI

@samuelgoto samuelgoto removed the agenda+ Regular CG meeting agenda items label Mar 29, 2023
@wseltzer wseltzer added the FPWD label Aug 6, 2024
@cbiesinger

OK, to keep y'all updated on the current status:
We have a prototype in Chrome behind the chrome://flags/#fedcm-metrics-endpoint flag.

If the FedCM request was successful, we send a credentialed request that includes the RP origin and these timings:

     "time_to_show_ui=%d"
     "&time_to_continue=%d"
     "&time_to_receive_token=%d"
     "&turnaround_time=%d"

In the failure case, we send an uncredentialed request that does not contain the RP origin and carries only the "error_code=%d" data. The error code is one of:

    kOther = 1,
    // Errors triggered by how RP calls FedCM API.
    kRpFailure = 100,
    // User Failures.
    kUserFailure = 200,
    // Generic IDP Failures.
    kIdpServerInvalidResponse = 300,
    kIdpServerUnavailable = 301,
    kManifestError = 302,
    // Specific IDP Failures.
    kAccountsEndpointInvalidResponse = 401,
    kTokenEndpointInvalidResponse = 402,

The metrics endpoint is specified using a metrics_endpoint field in the config file at the configURL.

Things I would love to have feedback on:

  • Would IDPs find it beneficial if the format of the reports matched the Reporting API (a JSON request body)? As implied above, the requests currently use an application/x-www-form-urlencoded format.
  • We are aware that some IDPs would like to calculate clickthrough rates and therefore would like reliable data on whether UI was shown (especially relevant in the failure case). We are discussing with privacy teams whether we can include this information.
  • Is any information missing from the above that you would like to see included?
  • The metrics request (even in the success case) needs to happen after the ID assertion endpoint request so that it can include timing information from that request. But this means that various pieces of information that the ID assertion endpoint has are not available to the metrics endpoint. Is there any such information that IDPs would like to have included in the metrics report (for the success case) that is also sent to the ID assertion endpoint?

@cbiesinger

For that last point, it would be possible to send the timing data in the ID assertion endpoint request itself, but of course then we couldn't include the timing for the ID assertion request itself. If you have thoughts on this tradeoff, please comment :)
