add permissions proposal #689

Closed
wants to merge 6 commits into from

joshmarinacci

This is a pull request to address when permissions are requested and how. Addresses #424

@NellWaliczek
Member

Thanks for putting this together, Josh! Ping to @johnpallett for feedback.

@cwilso cwilso added the agenda Request discussion in the next telecon/FTF label Jun 12, 2019
Contributor

@johnpallett johnpallett left a comment

Thanks @joshmarinacci for posting this! I've added a question and a comment.

designdocs/permissions.md
There are most likely more concerns than just what I have listed above.

### Solution
To address these I propose the following: *do both*. Let the developers have a single code path where they /request permissions at application start/ *and* when /the permission is actually needed/. The first request is considered the upper bounds of possible permissions. The user agent can decide whether to actually show user request dialogs at app time or when the permission is actually used, or some other scheme that the user agent deems to be better (such as warding off new attacks in the future).
Contributor

@johnpallett johnpallett Jun 12, 2019

Thanks @joshmarinacci - during our previous conversation you'd suggested a more granular set of feature requests which I took to mean something like:

Required - needed for the session to run at all.
Optional - not blocking for the session to run, but desirable.
Deferrable - not required at session creation, but might be required depending upon what the user does

note: I made up those names because I couldn't remember exactly what you said, but some of this is covered in #424

The intent as I understood it was to give guidance to the user agent on the timing of when user consent events would occur, and whether or not a session should be created if the user said 'no' to certain features.

Do you want to add that detail here? Can a user agent make the decisions outlined in this paragraph without this type of information?

Contributor

> note: I made up those names

About terminology, have you considered using a term other than "permissions" since this is somewhat different from existing web permissions? Please ignore if you've discussed this already. Elsewhere we've used terms such as "consent" to distinguish it, especially if this is likely to be a per-session prompt instead of permanently granting access to a site.

Member

Oh interesting... that's not the interpretation I took... I thought by granular we meant something like "spatial-tracking", "real-world-understanding", "camera-access", "eye-tracking", etc. The big question there is what are the buckets we should use? That said, if we do take the modularization approach, the good news is that we'd only need to define/bikeshed the one that would cover XRPose and XRViewerPose data for now, yeah? Then, as we add other modules, we can define the appropriate enum values.

The other interesting thing about this approach is that we have the opportunity to possibly integrate it with https://www.w3.org/TR/permissions/#enumdef-permissionname so that developers can request non-XR things up front as well! That would address the concerns about developers wanting to prompt for permissions for things like Microphone mid-immersive session :)

That said... I've been consistently getting the impression that folks from Google have specific concerns about the Permissions API given the number of times y'all have mentioned wanting to avoid the word "permissions" in favor of "user consent". Is there some background or context that I'm missing that makes it contentious?

Member

Actually, rereading @johnpallett's privacy doc, I'm realizing it's probably more like "spatial-tracking" and "spatial-tracking-unbounded" that would be needed for the core WebXR spec?

Contributor

> That said... I've been consistently getting the impression that folks from Google have specific concerns about the Permissions API given the number of times y'all have mentioned wanting to avoid the word "permissions" in favor of "user consent". Is there some background or context that I'm missing that makes it contentious?

Nell, the privacy explainer talks about some of the concerns and distinctions in the permissions section.

I wouldn't necessarily call it contentious, but I think it's helpful to make a distinction between the more abstract concept of "how do we get informed user consent for using these features" from a specific implementation method as in "let's use the browser's existing Permissions API support for this". While the permissions API spec is fairly general, the practical implementations in web browsers tend to be based around persistently saved per-site choices, and as the privacy doc explains this isn't necessarily a good fit to some of the AR/VR use cases. I think it's better to avoid the word "permissions" at least during the design phase to avoid conflating these two interpretations.

Personally I'm in favor of using the Permissions API where it makes sense, but AFAIK it hasn't commonly been used for use cases such as per-session temporary permissions. That seems compatible with the spec by slightly bending the "new information about the user’s intent" rule as applying when a session ends, but I think that's not what people would typically have in mind when talking about web permissions, and it may need additional implementation work in browsers to generalize internal permission handling to make this work.

Contributor

FWIW, https://w3c.github.io/encrypted-media/#dom-mediakeysrequirement is a similar tristate request for specific properties, and https://w3c.github.io/encrypted-media/#navigator-extension-requestmediakeysystemaccess works to satisfy the various requested properties. This API was also intended to allow a single consent request that covered various properties and possible configurations. Something like that might be useful here, though it gets complicated.

I made a comment above similar to @klausw's. Please also see https://github.com/immersive-web/privacy-and-security/blob/master/EXPLAINER.md#augmented-reality-mode.

Author

I wasn't aware of the Permissions spec until now. Interesting. Would its query() function violate our desire to avoid fingerprinting? As long as query() is called after the initial session request, would it still be okay?

Member

@ddorwin Thanks for the pointer to the MediaKeysRequirement stuff! Time to go fall down the rabbit hole of spelunking through a new spec!

Member

@NellWaliczek NellWaliczek Jun 14, 2019

@joshmarinacci That's basically the question I'm asking. If it's ok for Navigator, would it be ok if there was a XRSession.Permissions.query()?

Contributor

In the privacy design doc the term 'permission' was intentionally avoided in part to avoid inheriting the algorithms and requirements of the Permission API, and also because 'permissions' are an established concept in browsers. It's still not clear to me whether using the term 'permissions' leaves sufficient room for user consent flows such as the screenshot from @blairmacintyre (a dialog with multiple toggles), or user consent that is valid only for the duration of a session (vs. causing a more persistent state change). For those reasons the privacy design doc instead uses the term 'user consent' to describe the requirement.

@joshmarinacci my original question got a bit lost in this thread. Did you intend to have 'required', 'optional' and 'maybe later' as three different options when requesting features?


Note that if the application did not include MICROPHONE in the initial list of permissions passed to `requestSession` then the user agent *must* reject any later requests to `getUserMedia()`.

*The initial list of permissions is the maximum set of permissions the application may request throughout its lifetime.* This enables, say, a VR chat client to let the user enter a move-and-talk-only mode and later request camera access because the user wants to share something, without having to bug the user initially, provided the user agent supports immersive permission requests. On user agents that do not support immersive permission requests, for whatever reason (form factor, user prefs, domain name, phase of moon, etc.), the request for camera can be done at application start.
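
A minimal sketch of the "upper bound" rule described in the two paragraphs above, written as a plain TypeScript model rather than real WebXR code. `ConsentScope`, `Feature`, `Decision`, and the method names are hypothetical and do not come from the proposal. The only behavior taken from the text is that a feature missing from the initial list is rejected later, while a listed feature may still be prompted for whenever the user agent chooses.

```typescript
// Hypothetical feature names, for illustration only.
type Feature = "microphone" | "camera" | "spatial-tracking";

type Decision = "rejected-not-declared" | "already-granted" | "needs-prompt";

class ConsentScope {
  private readonly declared: ReadonlySet<Feature>;
  private readonly granted = new Set<Feature>();

  // `initial` models the list passed when the session is requested.
  constructor(initial: Feature[]) {
    this.declared = new Set(initial);
  }

  // Records that the user approved a feature (at start or mid-session).
  grant(feature: Feature): void {
    if (!this.declared.has(feature)) {
      throw new Error(`${feature} was not in the initial list`);
    }
    this.granted.add(feature);
  }

  // Decides what a later request for `feature` should do.
  check(feature: Feature): Decision {
    if (!this.declared.has(feature)) return "rejected-not-declared";
    return this.granted.has(feature) ? "already-granted" : "needs-prompt";
  }
}

const scope = new ConsentScope(["spatial-tracking", "camera"]);
scope.grant("spatial-tracking");
console.log(scope.check("camera"));     // needs-prompt
console.log(scope.check("microphone")); // rejected-not-declared
```

In this model the chat client from the example would list the camera up front, and the user agent would be free to show the camera prompt either at start or when `check("camera")` first returns `needs-prompt`.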
Contributor

"lifetime" is an important concept here - do you want to specifically say "duration of the session"?

If I understand it correctly, the application would be free to end the session and start a new one with a different set of desired permissions, while still preserving JS and graphics state. That has its own issues, i.e. a desktop VR setup may not offer an easy way to resume the session, but I think it's a potential alternative for developers. An app could have distinct modes where some features are only needed in specific circumstances, for example a VR sculpting app with optional copresence support?

Member

Yeah, this is totally a key thing we need to nail down. From what I'm gathering there are 3 possibilities under consideration for the lifetime of a permission:

  • Session
  • Browsing context
  • Origin

The big question I have is... can this be something left to the user agent to decide?

Author

Since these permissions are requested when obtaining the session, I think they should be for the lifetime of the session.

Contributor

The privacy design doc suggests that for a given origin, user consent for a set of features should last as long as the browsing context. This means the same origin doesn't have to repeatedly ask for user consent for the same set of features during the same browsing session. However, if the user closes the browser and restarts, then user consent would be required again.
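
To make the three candidate lifetimes in this thread concrete, here is a small sketch of how a consent store might expire grants. Everything in it (`ConsentStore`, `Lifetime`, `ScopeEnd`, and the event names) is a hypothetical model for discussion, not an API from WebXR or the Permissions spec, and it assumes that closing the browsing context also ends any session.

```typescript
// The three candidate lifetimes named in this thread.
type Lifetime = "session" | "browsing-context" | "origin";

// Hypothetical events that end a given scope.
type ScopeEnd = "session-ended" | "browsing-context-closed";

class ConsentStore {
  private grants = new Map<string, Lifetime>();

  grant(feature: string, lifetime: Lifetime): void {
    this.grants.set(feature, lifetime);
  }

  has(feature: string): boolean {
    return this.grants.has(feature);
  }

  // Drops every grant whose lifetime does not outlive the event.
  // "origin" grants persist across both events in this model.
  end(event: ScopeEnd): void {
    this.grants.forEach((lifetime, feature) => {
      const expires =
        lifetime === "session" ||
        (lifetime === "browsing-context" && event === "browsing-context-closed");
      if (expires) this.grants.delete(feature);
    });
  }
}

const store = new ConsentStore();
store.grant("spatial-tracking", "session");
store.grant("camera", "browsing-context");
store.end("session-ended");
console.log(store.has("spatial-tracking")); // false
console.log(store.has("camera"));           // true
```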

Member

@toji toji left a comment

Thanks for writing this up, Josh! Added some initial comments, based on concerns that have surfaced in the past when discussing similar approaches.

designdocs/permissions.md
})
```

At this point the user agent can either prompt the user for the microphone permission, or immediately grant it if the user already approved it previously.
Member

One caveat to this is that the timing of the application matters, and that cannot be known by the user agent. For example: Let's assume I have two applications that both use some plane geometry and camera access. AppA is an app where you place virtual furniture, and can get away with only having plane access until the user says they want to save a picture. AppB is a tour guide app and both plane access and camera access are absolutely essential to its function.

If we assume that the UA is capable of doing just-in-time prompts, AppA works out pretty well, in that the app starts up and asks "Planes, please?" Then waits until the user clicks the "Save image" button at which point the user is asked "Camera access: Cool, or nah?" User understands the context, clicks "Yes", everyone is happy.

However, with AppB the user first immediately sees a prompt saying "Planes!" then, upon clicking yes or no, gets an immediate followup prompt that says "Camera!" and now we're just pissing them off by making them click through multiple layers of prompts because the UA wanted to do the "right thing" of deferring permission till the time it was needed.

As such, we probably want the developer to be able to specify a list of permissions that MUST be requested prior to session creation, as well as permissions they MAY use later (which could then be requested up front based on UA capabilities if needed.)

Member

It's funny.. This exact thing was something I used to be pretty worried about... But this is how the rest of the web works today, yeah? So I'm kinda less worried? A couple things to consider:

  • Even if we manage to do something like you've described for XR related permissions, the same problem can occur for non-XR permission prompts while in an immersive session. Who's to say a developer won't ask for microphone and then immediately ask for geolocation?
  • Alternatively, given that promises aren't resolved immediately, perhaps user agents have the option to be clever and also delay the display of a permission prompt until after control has returned to the user agent? In which case a developer invoking two permissions requests in the same function flow would cause that consolidation to occur, yes?
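
The consolidation idea in the second bullet can be sketched as a small batching model. This is only an illustration under stated assumptions: `PromptBatcher` and its members are hypothetical, and the explicit `flush()` call stands in for "control has returned to the user agent", which a real implementation would detect itself.

```typescript
// Collects consent requests until control returns to the user agent,
// modelled here by an explicit flush() call.
class PromptBatcher {
  private pending: string[] = [];
  // Each entry is the list of features covered by one user-visible prompt.
  readonly shownPrompts: string[][] = [];

  // Called by the page; nothing is shown to the user yet.
  request(feature: string): void {
    if (this.pending.indexOf(feature) === -1) this.pending.push(feature);
  }

  // Called once the page's current task has finished.
  flush(): void {
    if (this.pending.length === 0) return;
    this.shownPrompts.push(this.pending);
    this.pending = [];
  }
}

const batcher = new PromptBatcher();
batcher.request("microphone");
batcher.request("geolocation");
batcher.flush();
console.log(batcher.shownPrompts.length); // 1 (one prompt covering both features)
```

Two requests made in the same function flow end up in a single prompt, while a request made after a later `flush()` would produce a separate one.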

Author

Is this the same thing @johnpallett was asking about above? A way to specify if a particular permission is must vs may (required vs optional)?

Contributor

I'd actually asked whether there should be three options for feature requests: "required", "optional" and "later, not now" (the third one I don't have the right name for). @joshmarinacci this was based upon what I thought you described during the f2f.

The idea behind these three options, as I understood it, was:

  • Required: I need this. Otherwise, fail to create the session.
  • Optional: I don't necessarily need this, but please get consent now so I know what I have.
  • Later, Not Now: I don't necessarily need this, and I don't need to know now. You can ask later, or now, either works for me.

That way the user agent has a sense of when to ask for user consent: Required at session creation, Optional at session creation (if the platform supports the feature), 'Later not now' either at session creation, or when the feature is actually used, the choice being at the discretion of the user agent.
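
A rough sketch of how a user agent might act on the three options above. The names (`Level`, `FeatureRequest`, `planPrompts`, `sessionCanStart`, `canPromptInSession`) are made up for illustration, and using a single boolean to decide where "Later, Not Now" features are prompted is just one possible policy, since the comment leaves that choice to the user agent.

```typescript
type Level = "required" | "optional" | "deferrable";

interface FeatureRequest {
  feature: string;
  level: Level;
}

interface PromptPlan {
  atSessionCreation: string[];
  atFirstUse: string[];
}

// `canPromptInSession` stands in for whatever a user agent knows about
// its ability to show trustworthy prompts during a session.
function planPrompts(
  requests: FeatureRequest[],
  canPromptInSession: boolean
): PromptPlan {
  const plan: PromptPlan = { atSessionCreation: [], atFirstUse: [] };
  for (const { feature, level } of requests) {
    // Only "deferrable" features may wait; everything else is asked up front.
    const defer = level === "deferrable" && canPromptInSession;
    (defer ? plan.atFirstUse : plan.atSessionCreation).push(feature);
  }
  return plan;
}

// A declined "required" feature fails session creation; others do not.
function sessionCanStart(requests: FeatureRequest[], declined: string[]): boolean {
  return !requests.some(
    (r) => r.level === "required" && declined.indexOf(r.feature) !== -1
  );
}

const requests: FeatureRequest[] = [
  { feature: "spatial-tracking", level: "required" },
  { feature: "camera", level: "optional" },
  { feature: "microphone", level: "deferrable" },
];
console.log(planPrompts(requests, true).atFirstUse);  // only "microphone" is deferred
console.log(sessionCanStart(requests, ["camera"]));   // true
```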


### Advantages
This system has two advantages over other approaches.
* the developer has a single code path that will always be followed. They will not need to write conditional code to work around different user agent authentication systems.
Member

Developers will likely want a method for determining if permission for a feature has already been granted prior to calling a potentially-permission-producing method. This is to allow for the common pattern of displaying a quick in-app explanation regarding why the user is about to see a dialog just prior to displaying the platform-level prompt. If the permission has already been granted at session start, however, they would end up showing a message saying "Hey! You're about to see something that looks like this, and here's why you should click 'Yes'" followed by nothing, which would appear broken to the user.
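
A sketch of the "explain, then prompt" pattern described above, assuming some way to query consent state exists. The three states mirror the `PermissionState` values in the Permissions spec; `requestWithExplainer`, `shouldShowExplainer`, and the injected `query`, `showExplainer`, and `request` callbacks are hypothetical, and are passed in precisely because this thread has not settled where such a query would live.

```typescript
// Mirrors the three states defined by the Permissions spec.
type ConsentState = "granted" | "denied" | "prompt";

// Show the in-app explanation only when a platform prompt would follow.
function shouldShowExplainer(state: ConsentState): boolean {
  return state === "prompt";
}

// `query` is injected so the sketch does not assume where such a method
// would live (navigator.permissions, the XR session, or elsewhere).
async function requestWithExplainer(
  feature: string,
  query: (feature: string) => Promise<ConsentState>,
  showExplainer: () => void,
  request: (feature: string) => Promise<boolean>
): Promise<boolean> {
  const state = await query(feature);
  if (state === "denied") return false;            // do not nag the user
  if (shouldShowExplainer(state)) showExplainer(); // skipped when already granted
  return request(feature);
}
```

With a query available, an app whose consent was granted at session start skips the explanation entirely instead of announcing a dialog that never appears.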

Member

Ah, I meant to put that other comment here... moving it....

If we do take the approach of integrating with https://www.w3.org/TR/permissions/#enumdef-permissionname then we can also take advantage of the https://www.w3.org/TR/permissions/#dom-permissions-query API? To be honest, it's interesting to me that this is queryable, since for some reason I thought there were concerns about websites being able to differentiate between a user rejecting a permission and some other random reason a feature might not be available? 🤔

Author

Yes, I thought this was a fingerprinting issue? If not then yes, let's have a query.

I can't think of a decent way to use the query API as a fingerprint, but I'm happy to be corrected if someone else can. I'm tentatively in favor of allowing the query (because it will enable more progressive permissions--yay!)

@NellWaliczek
Member

From what I can tell we have some key questions to resolve (some of which are in regards to integration with the Permissions Spec):

  1. Is there any issue with allowing each UA to decide for itself when to display a request for user consent?
  2. Is there a specific aspect of the privacy spec that prevents each UA from deciding for itself the appropriate visualization of gathering user consent?
  3. Is there any issue with allowing each UA to decide for itself the appropriate duration of user consent?
  4. Is there a specific aspect of the privacy spec that prevents each UA from deciding the appropriate duration of user consent?
  5. Is there any concern with allowing developers to query the state of user consent?
  6. Is there any concern with the mechanism of querying user consent being through the Permissions.query() API?
  7. If so, what are the permissions we should be enumerating? What is the scope (ex. "things that impact poses from local or bounded reference spaces" is a strawman we can pick apart) and then what should we call the enum values ("spatial-tracking-unbounded"? or?)

Are there other questions I'm missing?



### Philosophy
The user-agent will always be closer to the user than a spec. The user agent has greater access to current conditions than we (the spec writers) ever will. As a philosophy we should give the user agent *as much freedom as possible to “do the right thing”* under whatever circumstances are at the time a web application is started.
Contributor

This could be interpreted as meaning that we don't need privacy and security guidelines or related normative text.

I think it's possible for such text to be written in a way that allows applications to "do the right thing."

Author

Are you saying this statement gives the user agent too much freedom? How about "As a philosophy we should give the user agent as much freedom as possible to do the right thing under whatever circumstances are at the time a web application is started while at the same time allowing applications to create a smooth workflow for their users"

Contributor

Yes, it implies that such freedom is the priority over security and privacy. Specifically, "as much freedom as possible" could lead to this interpretation, but there should probably also be some explicit reference to being within the privacy and security guidelines/requirements.

nit: Although it's used in a reasonable way in isolation, the word "whatever" contributes to the laissez-faire tone of the statement. This can probably be fixed with: s/whatever circumstances are/the circumstances/


### stating concerns

Concerns about granting at page load time are
Contributor

I don't think "page load" is the right timeframe/term. Maybe something like "when starting the experience."

Such a term can cover explicit requests when the user clicks "Enter VR", requests resulting from requestSession(), and (in the future) seamless immersive navigation scenarios. It might even be worth mentioning those - or breaking them up if there are differences.

Author

I can change references to 'page load time' to 'request immersive session time'. Would that work for you?

Member

Actually, since we'd like it to be possible for a page to request a "local" or "local-floor" reference space for an inline session, this will need to apply more generically.

Author

What is the appropriate language here?

designdocs/permissions.md
Concerns about granting requests once already inside of immersive mode are:
* access requests when inside immersive mode should be done with some sort of a secure method, such as a dedicated hardware button, so that applications cannot spoof such permissions.
* some devices or scenarios may not be able to provide such a secure method, thus doing the request before entering immersive mode is preferable.

Contributor

Another critical concern is that this can change the privacy properties, which the user agent may have carefully explained to the user when creating the session. This could lead to user confusion, frustration, etc.

As an example, if the "AR Mode" entry interstitial says "the application has access" then the application is allowed to request more access - the user may be confused or not understand the implications of combining those levels of access. Browser permission UI is not generally sufficient to explain such interactions, so the initial interstitial is a unique opportunity.

/cc @johnpallett @NellWaliczek

Member

I'm still trying to read between the lines here and failing :-/

Are you saying that user agents MUST NOT request user consent both at the beginning and during a session? Even if the consent being requested mid-session is unrelated to the consent granted at the beginning?

I think the root of my confusion is that I'm struggling to differentiate between what you're saying Chrome is planning to do vs. what y'all are saying must be requirements on all user agents.

Member

(to be clear, I'm not taking a stance either way! I'm just trying to ensure I'm understanding your stance accurately)

Contributor

To clarify, my comments are general about what the spec (and this doc) say, which are related to general guidance and requirements that apply to all user agents.

I'm just raising the concern so that it can be considered and addressed. My comment was not intended to imply any requirements (i.e., "MUST NOT request..."). While that is one possible solution, there may be others.

FYI, I filed #702 for related issues.

* the user agents know more about the user and the current scenario, and can make better decisions *at the time of use* than a spec can. Neither method gives the user agent freedom to adapt.
* there may be new attacks devised or new flaws discovered after we have shipped, but user agents will not be able to address these because they are constrained by the spec.

There are most likely more concerns than just what I have listed above.
Contributor

One goal of session-based consent (for non-inline sessions) is to make the scope of that consent clear to the user and to avoid persisting consent (i.e., "permissions" - see @klausw's comment below) that could be used in future non-WebXR visits to that origin. Both of these were driving factors behind "AR Mode" - see https://github.com/immersive-web/privacy-and-security/blob/master/EXPLAINER.md#augmented-reality-mode

Member

@NellWaliczek NellWaliczek Jun 14, 2019

Yep, and I did read that when it was originally designed. But again, I'm confused about if "Instead of permissions, an alternative approach could be..." is intended as a requirement for all user agents or just an option.

Contributor

That repository's readme says, "The purpose of this repo is to explore those threat vectors and possible mitigations that may form the basis of the Privacy and Security Considerations sections for APIs related to the immersive web, as well as informing normative requirements for those APIs." The exact requirements for WebXR or AR modes are still TBD in the specifications in this and other repositories.

I believe that the AR Mode concept is critical to the experience and to developers knowing what to expect, so there would likely be normative language about it for all user agents. A simple example is the fact that consent is session-based, which means users may need to consent each time an AR session is requested. The API design could still allow for configuration as discussed elsewhere in this PR.


At this point the user agent can either prompt the user for the microphone permission, or immediately grant it if the user already approved it previously.

Note that if the application did not include MICROPHONE in the initial list of permissions passed to `requestSession` then the user agent *must* reject any later requests to `getUserMedia()`.
Contributor

This starts to affect other APIs and the permission infrastructure of implementations. That would probably require greater socialization and review. (The permissions experience on the web could definitely be better, and there are probably other efforts looking at that, but it's beyond the scope of WebXR.)

* there may be new attacks devised or new flaws discovered after we have shipped, but user agents will not be able to address these because they are constrained by the spec.

There are most likely more concerns than just what I have listed above.

Contributor

Another goal is that developers should generally know what to expect across user agents (i.e., what may cause a prompt and when) and that the user experience should be relatively consistent across user agents.

Member

Is there agreement on this point? From what I can tell, some UAs want to do just-in-time permissions but others want them upfront. As long as a single code path exists that will hit both, is there a specific reason UAs should be forced to conform?

Author

The point of this proposal is to give UAs freedom while giving developers only a single path to code.

Contributor

I think the specification needs to make it clear to developers when there might be consent prompts so they can avoid patterns that may cause undesirable results. It would also help with compatibility if there are fewer ways things can vary across implementations.

@avadacatavra

avadacatavra commented Jun 14, 2019

In order to hit VR Complete status by the end of June, this should land no later than next Friday. So far, the discussion here and in the privacy repo has been lengthy and in depth (as it should be), but IMO the level of detail has distracted us from surfacing what we, as a WG, need to find consensus on---a consensus that will be informed by the details. I think that this PR makes a good step towards surfacing these questions. As I understand it, there are three key questions we must resolve:

  1. What is the lifetime of user-granted consent?
     • session
     • browsing context
     • origin
  2. When do we prompt for consent?
     • session creation
     • time of use
     • some hybrid model?
     • as Nell points out, can we create a path that will allow UAs to do either?
  3. What granularity of permissions/consent do we request?
     • Required/optional/deferred
     • is this also concerned with bundling?
       • how would we bundle permissions? shared threat vectors?
       • because XR uses so many sensors, I think there's a good argument for bundling to avoid requesting permissions regarding sensors that are complex to explain (and will likely lead to fatigue)

Additionally, what integration should our spec have with the Permissions API?

@NellWaliczek
Member

So in talking with Brandon he pointed out that a major concern of mine hadn't been clear to him until I said it again just now

If the premise is that prompting for permission inside immersive sessions is dangerous and some UAs will choose to do it all up front as a result... what does that mean for permission requests unrelated to XR that occur within an immersive session? For example microphone or bluetooth?

@joshmarinacci
Author

> So in talking with Brandon he pointed out that a major concern of mine hadn't been clear to him until I said it again just now
>
> If the premise is that prompting for permission inside immersive sessions is dangerous and some UAs will choose to do it all up front as a result... what does that mean for permission requests unrelated to XR that occur within an immersive session? For example microphone or bluetooth?

Are we allowed to make recommendations for other specs? I suppose we can say that UAs may wish to join these permissions with the XR ones, but I don't think we can specify that this must be the case. Is this beyond our charter?

@avadacatavra

@NellWaliczek that's a really good point

At Mozilla, we've definitely talked about how we can safely request user consent from an immersive session, the key problem being that there's no way to indicate that a user should trust the request (if this is related to a different reason, feel free to disregard). There are a few potential approaches:

  • use a hardware button that would cause different behavior if the prompt is spoofed
  • suspend the immersive session, prompt, then return to the session
  • create some sort of trust indicator that we can use within the immersive session to indicate that the prompt is valid
  • request all permissions at session creation

I would say that all user consent requests that occur within an immersive session must be consistent; otherwise we'll be creating an expectation of inconsistency that could defeat all of the above approaches. Ideally, all user consent requests will be consistent across UAs, but I would say that they should certainly be consistent within a UA. For example, requesting microphone data while in immersive mode would result in the same user flow as requesting XRPose data.

@ddorwin
Contributor

ddorwin commented Jun 14, 2019

@NellWaliczek wrote:

> From what I can tell we have some key questions to resolve (some of which are in regards to integration with the Permissions Spec):
>
> 1. Is there any issue with allowing each UA to decide for itself when to display a request for user consent?

"When" can have multiple meanings here:

  • A point in time (or algorithm): There should be normatively specified points where user interaction might be required.
  • Whether: User agents are often given more leeway here, especially in the how. However, especially for very sensitive data, user agents can be required to get express consent/permission. For example, http://w3c.github.io/geolocation-api/#security, though note that even in that case there are both MUST and SHOULD statements.

> 1. Is there a specific aspect of the privacy spec that prevents each UA from deciding for itself the appropriate visualization of gathering user consent?

Do you mean the privacy design doc? I'm not sure, but I'll provide a more general response: No, implementations have a lot of leeway in UX as long as the behavior is compliant with the spec, which should try to ensure consistent/interoperable observable behavior as well as avoiding surprising behavior for developers.

> 1. Is there any issue with allowing each UA to decide for itself the appropriate duration of user consent?

In general, I don't think this has been the case, but I could see things becoming more prescriptive as specifications and related processes focus more on privacy issues. For example, the specification for some highly-sensitive API might specify that consent should not be persisted beyond the browsing session. The API could also be designed in such a way that user interaction is always required (i.e., a file or device picker).

It is also reasonable for specifications to set some lower boundaries on duration to ensure consistent behavior. For example, a developer may not expect each call to some API to prompt the user, so the specification might say that such consent should last for the context's lifetime. Historically, this may have been handled by the persistence of permissions, but it's good to be explicit in the specification, especially when not assuming persisted permissions. For WebXR-based consent, it seems like the duration should be normatively tied to the session lifetime in at least some cases.
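As an illustration of consent whose duration is tied to the session lifetime, the sketch below caches each decision for the life of one session so that repeated calls do not re-prompt, and forgets everything when the session ends. `SessionScopedConsent`, its methods, and the dimension strings are hypothetical names chosen for this example; nothing here is taken from the WebXR specification.

```typescript
// Hypothetical sketch: consent decisions live exactly as long as one session.
class SessionScopedConsent {
  private askUser: (dimension: string) => boolean; // stand-in for the UA's consent UI
  private decisions = new Map<string, boolean>();
  promptCount = 0;

  constructor(askUser: (dimension: string) => boolean) {
    this.askUser = askUser;
  }

  // Prompts at most once per consent dimension per session, then reuses the answer.
  check(dimension: string): boolean {
    const cached = this.decisions.get(dimension);
    if (cached !== undefined) return cached;
    this.promptCount += 1;
    const answer = this.askUser(dimension);
    this.decisions.set(dimension, answer);
    return answer;
  }

  // Ending the session discards every decision; nothing is persisted.
  endSession(): void {
    this.decisions.clear();
  }
}
```

A denial is cached just like a grant in this sketch, so a developer would not see a second prompt for the same dimension until a new session begins.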

> 1. Is there a specific aspect of the privacy spec that prevents each UA from deciding the appropriate duration of user consent?

I'm not sure, but I hope it aligns with my answer above.

> 1. Is there any concern with allowing developers to query the state of user consent?

(This is my own analysis and not representative of any general position or guidance.)

There could be concerns. One is that the application can determine whether a user denied a permission vs. just didn't have support in their user agent or platform. If, for example, requestSession() always rejects with "NotSupportedError", it's not trivial to determine that the user denied the request vs. doesn't have hardware. (This could still be inferred via timing.)

Another is that with a little knowledge about specific implementations, applications might also be able to infer that the user has previously visited the origin even if they have cleared site data. For example, if a site always requests geolocation, the fact that the query result is not "prompt" could be a good signal that the user has previously visited the site (unless they have this disabled for all sites in the user agent's settings).

Silently querying various permissions could also provide data for fingerprinting, especially across multiple visits to the same origin even after clearing site data.

What I don't know is how much the specifications prioritize mitigating these concerns. The Permissions API would seem to explicitly enable the scenarios described above.
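The inference described above can be sketched as follows. In a browser, the states would come from `navigator.permissions.query({ name })`, whose result exposes a `state` of "granted", "denied", or "prompt"; to keep the example self-contained, the functions below take already-collected states as plain data. `ConsentState`, `suggestsPriorVisit`, and `stateSignature` are hypothetical names used only for illustration.

```typescript
// Mirrors the three states reported by the Permissions API; declared locally so the
// sketch does not depend on DOM typings.
type ConsentState = "granted" | "denied" | "prompt";

// A state other than "prompt" means a decision was recorded at some point, which can
// hint at a prior visit even after site data has been cleared.
function suggestsPriorVisit(states: Record<string, ConsentState>): boolean {
  return Object.keys(states).some((name) => states[name] !== "prompt");
}

// Silently collected states form a stable string that could contribute to a fingerprint.
function stateSignature(states: Record<string, ConsentState>): string {
  return Object.keys(states)
    .sort()
    .map((name) => `${name}=${states[name]}`)
    .join(";");
}
```

As noted in the comment, a user agent setting that disables a permission for all sites would make the first signal unreliable, so this is an inference rather than proof of a prior visit.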

> 1. Is there any concern with the mechanism of querying user consent being through the Permissions.query() API?

See above.

The main issue related to the use of this specific API is that the various dimensions of consent would need to be added to the https://w3c.github.io/permissions/#permission-registry and that this comes with certain implications (see my discussion of "permissions" vs. "user consent" below).

> 1. If so, what are the permissions we should be enumerating? What is the scope (ex. "things that impact poses from local or bounded reference spaces" is a strawman we can pick apart) and then what should we call the enum values ("spatial-tracking-unbounded"? or?)

I think this is an open issue that needs some dedicated thought and discussion. The privacy design details various threats and why consent is required, but further work is necessary to determine how many different dimensions of consent make sense and what they should be.

IMO, something like "spatial-tracking-unbounded" is probably too granular, which would become difficult for developers to use. IMO, it probably makes more sense to define what the user is giving up, which may not map 1:1 to a single concept in the API surface.

> Are there other questions I'm missing?

You also asked about the distinction between "permissions" and "user consent." The way I view this is that "consent" is something that the user provides and there are various mechanisms for this - "permissions," clicking through an interstitial, selecting a file from a picker, etc. On the other hand, "permissions" is more specific to browser permissions, which are often persisted, and related UI (i.e., Chrome's chrome://settings/content). This is why we use the term "consent," especially for things where there isn't a clear decision to use or require a "permission."

Note also that we might want to have more dimensions of consent than there are permissions, especially in browser UI. If you look at https://w3c.github.io/permissions/#permission-registry or Chrome's chrome://settings/content UI, the "permissions" tend to be one per API or even cover multiple APIs. On the other hand, we are talking about at least several dimensions for a single API (WebXR) or even mode of that API.

Note that policy-controlled features roughly follow that same model (covering an entire or multiple APIs). We'll need to work through that when addressing #308 too.

@ddorwin
Contributor

ddorwin commented Jun 14, 2019

@NellWaliczek wrote:

> If the premise is that prompting for permission inside immersive sessions is dangerous and some UAs will choose to do it all up front as a result... what does that mean for permission requests unrelated to XR that occur within an immersive session? For example microphone or bluetooth?

#702 describes some issues related to non-XR permission-requiring APIs.

@avadacatavra

Consent duration

@ddorwin:

> Whether: User agents are often given more leeway here, especially in the how. However, especially for very sensitive data, user agents can be required to get express consent/permission. For example, http://w3c.github.io/geolocation-api/#security, though note that even in that case there are both MUST and SHOULD statements.

IMO the question of 'whether' is something that we shouldn't be asking here, focusing instead on when. For example, we don't require explicit consent for accessing ambient light sensor data, but we've also seen that used as a side channel vector. I think it's out of scope to determine 'whether,' particularly given the time constraints.

> For WebXR-based consent, it seems like the duration should be normatively tied to the session lifetime in at least some cases.

Can you expand on what cases shouldn't be tied to session lifetime?

Permissions API

The other key part of this comment is the thorough discussion of the Permissions API, in particular the query mechanism. I'm going to try to surface what we as a WG need to decide to meet deadlines.

The first question is should we integrate XR permissions into the Permissions API? It seems attractive to work within this existing framework. I see a few considerations:

  • integrate with existing API (both pros and cons)
  • possibility of fingerprinting via query mechanism
    • I'm not convinced that this is a particularly dangerous/identifiable fingerprint, although in combination with other information, it might be. I'm happy to be disproved though.
  • dimensions of consent required
    • Agreed that this is an important issue. I think it might be out of scope for the current deadline though

@NellWaliczek
Member

Per my conversation with @ddorwin and the discussion on yesterday's call, it seemed sensible to divide up the large number of related but distinct topics covered by the discussion on this PR into separate issues to make it easier to follow. Fingers crossed that I've captured things correctly so far and haven't missed anything major!
Once we've gotten most of those issues resolved, we can come back and figure out what we'd like to do with this PR.

@ddorwin
Contributor

ddorwin commented Jun 20, 2019

Answering @avadacatavra's question:

> @ddorwin:
> ...
>
> > For WebXR-based consent, it seems like the duration should be normatively tied to the session lifetime in at least some cases.
>
> Can you expand on what cases shouldn't be tied to session lifetime?

I didn't have any specific cases in mind. I was just being careful to not rule anything out. It does seem likely that it will make sense to users if the scope/duration is consistent.

@NellWaliczek
Member

Given that this thread has gotten unwieldy, shall we continue the discussion in the appropriate sub issues? :) In this case, #721

@NellWaliczek
Member

This issue has now been addressed by #734. Per discussion with @joshmarinacci, closing this PR.

@AdaRoseCannon AdaRoseCannon removed the agenda Request discussion in the next telecon/FTF label Sep 18, 2019
Labels
privacy-and-security Issues related to privacy and security

9 participants