Define how to request features which require user consent #424
Comments
I like this. One question: if a feature isn't available (either not supported, or because the user declines permission), how is this exposed? When thinking about geoAlignment, I imagined requesting it (in this case, adding This means for example, an app knows immediately if a browser supports meshing, versus just not having received a mesh yet. In the former case, it can do what it needs to do to work without meshing. In the latter case, it might display a floating hint asking the user to "look around" (or whatever). Aside: can we add "geoAlignment: true," to these examples, since it's a pretty good candidate for this approach? There are a lot of platforms that might not support this, or ones that might support it intermittently. In particular, on a device like HoloLens or ML1, with no compass, supporting geoAlignment of the coordinate system may be impossible to do transparently at the framework level since there will be no sensor for heading. But one could imagine the platforms extending their mapping capabilities to allow some sort of offline (manual or automatic) alignment of maps with geospatial coordinates in the future; in such a case, geoAlignment might return true ("it's possible"), but a separate geoAlignment property on the session might be true or false depending on whether the coordinate frames are currently aligned. I can imagine other features might have a similar need to differentiate between "feature available and allowed" and "feature currently active", beyond meshing and alignment. |
One more question - if one or more features require user consent, I'm presuming the resolution of the session creation will async wait until the user consent resolves or rejects. Are we concerned that this might expose (through timing) the difference between not supported and not allowed? |
The presumption about the promise not resolving until consent is given is correct. Good point about the timing "attack", though I'm not sure how big of a concern it is? Talking it through: Let's say the goal of a malicious app is to sniff out the presence of various hardware features even if the user doesn't want to expose them. We've already established one safety mechanism in that the only time at which they can do it is during immersive session creation, which must happen on user activation and has extremely visible side effects, so spamming the test isn't practical. Also, since any errors won't tell you which features failed you can't reasonably sniff out more than one at a time, and that's only if your "bait" experience doesn't use anything but the core API. Additionally, on some browsers you may still get a consent prompt even when no features are requested. (We've talked about always having some basic consent process for SO, that means that you can potentially use a timing attack to infer feature presence if and only if:
At which point you can gain a fuzzy idea of a single bit of data. Which seems like... a lot? I'm having a hard time imagining a feature for which knowledge of its presence (without the ability to use said feature) is so valuable that someone would actually go through the effort. |
I agree with @blairmacintyre that there needs to be a way for the app to know which of the requested features are available, especially in cases such as environment meshes or even simple hit tests where actual data may be initially unavailable. On the implementation side, I'm a bit concerned about About "Not requesting access to RGB camera data until the user clicks a "take a photo" button", I think the app could simply ask for the "RGB camera data" feature at session creation time, but the user could still have multiple ways to get a similar result:
Bigger picture, how important is it to support this runtime feature selection? Would it make sense to just start out with everything being requested at session creation time for the initial launch, and tackling dynamic features in a followup? If we can make exiting/re-entering sessions pretty quick and seamless, would that be good enough? Note that exiting/re-entering wouldn't necessarily involve taking off a VR headset, i.e. Daydream's VR browser makes this pretty seamless. @toji wrote:
I'm not sure how to parse that - did you mean "seems like a lot of effort to get a single bit of data"? |
To be clear, I was trying to say that the app should know if the session supports a feature, not that the browser/device supports a feature. I do not think that the session should indicate what features the device is capable of. So, a session would say But, a session might say |
I fully agree that this should all be based on per session support - if the
user (or user agent on behalf of the user) decide to keep a feature
disabled for a session, this should look the same from the app's
perspective as if the device inherently couldn't support the feature due to
hardware limitations. Also, an app should be able to deal with a capability
change when the user stops and restarts a session. This may be due to the
user examining an application first before deciding on granting advanced
environment access for a second session.
Conversely, I'd also be open to the idea that a user agent could claim
support for features that are actually emulated, i.e. 6DoF movement via
thumbstick locomotion for users who don't have enough space for roomscale
or have mobility restrictions. There may be cases where this would be
undesirable for an application, for example in a competitive multiplayer
game, but arguably this kind of problem already exists on the web platform
where the user could be running a modified browser.
|
It's not clear whether bespoke feature requests (after session creation) can be supported across all platforms in a way that gives developers a predictable cross-platform user experience. For example, different platforms may give the user agent different options for how to present a trusted interface, but those differences can result in significantly different user experiences:
This divergence could lead to significant challenges for cross-platform development because developers may not be able to predict how user consent will impact the user experience. For example, a developer who builds an experience on an HMD with an input device (and tunes the experience for those consent affordances) may not realize that the same experience on a tethered device might require the user to remove the headset for every instance of user consent. In extreme cases, a developer may require a bespoke feature that the user physically cannot consent to on certain platforms, rendering those platforms inoperable for those parts of the experience in ways that the developer cannot predict. More generally, because different platforms may introduce different levels of disruption or consent capability, it seems that the only pattern that a developer could safely use for cross-platform development is to request the desired features at time of session creation. Doing otherwise would give an unpredictable cross-platform experience with varying levels of discomfort for the user that the developer may not be able to predict. Further, bespoke feature requests lead to other concerns, including:
@avadacatavra curious to get your thoughts here. |
Wanted to leave a quick comment thanking John for reviving this important issue with well-researched recommendations. I'd also like to point out that one of his principal conclusions, that we should not allow for bespoke feature requests, directly contradicts one of my stated goals earlier in the thread. Despite that, after John graciously took the time to talk through the reasoning for that recommendation with me, I find that I agree with his conclusion, especially as we consider how varied the immersive hardware ecosystem is. I anticipate that, like me, other WG members will have concerns about front-loading consent for all of the features used by a session, and I'd encourage you to voice them. We want to make sure we're doing what we can to provide a reasonable support path for as many compelling use cases as possible while keeping users safe and informed, and we can only get there if we're considering a diverse range of UAs, devices, and experiences. |
@johnpallett You make a good point about the different ways various devices present a trusted interface. Given that, I agree that it's sensible to prompt for consent on session creation. I really like Blair's approach to presenting the permissions at one time, but separated, and allowing for easy toggle. One question I have: how would this work with immersive navigation? For example, I start an immersive session with foo.com, then navigate (while still in immersive mode) to bar.com. Would this trigger a new WebXR session (thus causing the discomfort due to different trusted interfaces) or would this transfer the foo.com permissions to bar.com? |
@avadacatavra if the user navigates to bar.com and the immersive session will share data that requires user consent, then I believe there'd be the same discomfort due to different trusted interfaces (regardless of whether or not there is a new WebXR session). It may be that cross-origin navigation would suffer from the same discomfort regardless of consent, though. Even if user consent was not required (for example, if bar.com had already received consent in the current browsing context), the user would presumably need some indication of what origin they were visiting from a trusted interface, otherwise any origin could pretend to be TrustedSite.com (even if the origin is actually BadSite.com) and solicit sensitive information. This is discussed to some degree in the navigation explainer and the navigation repo, but I don't believe there is a proposal for how to address discomfort and different types of trusted interfaces during cross-origin navigation. I've added navigation/#5 to make sure this question is captured. |
This issue is fixed by PR #739 |
Closed by #739 |
We've identified that there will be multiple features that WebXR will want to expose over time that will require some form of user consent. This subject has gone back and forth a LOT, and I'm not going to try to capture all of the discussion again here, but instead present my current thinking and iterate from there.
We want the request for these features to have certain properties (Partially pulled from #330):
The most straightforward way I see for addressing that is to simply pass the requested features into the session request call, either as dictionary keys or possibly an array of strings. For the moment I'm going to suggest the dictionary key route:
And for bespoke feature requests afterwards have an explicit function that takes in the same dictionary args (minus the session mode):
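A rough sketch of the two call shapes described here, not the original snippets. It assumes a mock object standing in for `navigator.xr` so that it runs outside a browser, a `mode` key carrying the session mode, and a hypothetical in-session function named `requestFeatures` (the actual name is not given in this post). The feature keys `lightingEstimation` and `geoAlignment` are taken from this thread; everything else is illustrative.

```javascript
// Hypothetical stand-in for navigator.xr; not the real WebXR implementation.
function createMockXR() {
  return {
    // Session request: the session mode and the feature keys share one dictionary.
    requestSession(options) {
      const { mode, ...features } = options;
      const session = {
        mode,
        requestedFeatures: { ...features },
        // Hypothetical name for the later, in-session request.
        // Takes the same feature keys, minus the session mode.
        requestFeatures(moreFeatures) {
          Object.assign(session.requestedFeatures, moreFeatures);
          return Promise.resolve(session);
        },
      };
      return Promise.resolve(session);
    },
  };
}

// Usage: request one feature up front, another one later in the session.
createMockXR()
  .requestSession({ mode: 'immersive-vr', lightingEstimation: 'ambient' })
  .then((session) => session.requestFeatures({ geoAlignment: true }))
  .then((session) => console.log(session.mode, session.requestedFeatures));
```

The mock only records what was asked for; whether a feature is actually granted is a separate question, covered by the failure-mode discussion that follows.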
The UA should not reject the session if any of these features are either not supported or the user does not consent to giving access to them. Instead, the session should still be created as normal, and the mechanism for accessing the feature once enabled should return null, reject, report an error, or use whatever failure mode is appropriate. No differentiation should be made between the feature not being supported and the user denying consent, to avoid leaking information about the user's system without their permission. That way we're encouraging developers to be responsive to varying feature sets. If a feature is deemed to be absolutely critical for developers to identify prior to starting a session, it should be handled by the mechanism discussed in #423.
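To make the "no differentiation" rule concrete, here is a minimal sketch of how a UA might resolve requested features. The functions `resolveFeatures`, `createSession`, and `getFeature` are hypothetical names invented for this illustration, and returning `null` is just one of the failure modes mentioned above.

```javascript
// Hypothetical UA-side logic. `supported` and `consented` are Sets of feature
// names that only the UA can see; the page never learns their contents.
function resolveFeatures(requested, supported, consented) {
  const granted = {};
  for (const name of Object.keys(requested)) {
    // Unsupported and denied collapse into the same outcome: not granted.
    if (supported.has(name) && consented.has(name)) {
      granted[name] = requested[name];
    }
  }
  return granted;
}

// The session is always created, whatever subset of features was granted.
function createSession(requested, supported, consented) {
  const granted = resolveFeatures(requested, supported, consented);
  return {
    // A feature that was not granted reads as null, with no reason attached.
    getFeature(name) {
      return Object.prototype.hasOwnProperty.call(granted, name) ? granted[name] : null;
    },
  };
}

// Usage: a device without the hardware vs. a user who declined.
const request = { geoAlignment: true };
const noHardware = createSession(request, new Set(), new Set());
const userDeclined = createSession(request, new Set(['geoAlignment']), new Set());
console.log(noHardware.getFeature('geoAlignment'), userDeclined.getFeature('geoAlignment')); // null null
```

From the app's side both sessions look identical, which is the property the paragraph above asks for.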
To that end, it's natural to draw some parallels between this and #423, and as such it feels like they should both use the same mechanism for identifying features (and maybe even share enums) but at the moment I'm going to recommend against that. For one, use of dictionary keys makes it possible to make the permission requests more expressive. For example:
lightingEstimation: 'ambient'
instead of a simple boolean. Also, since the request model should allow session creation to still succeed even when the feature isn't supported, dictionaries provide the right implicit guarantee: keys that aren't understood will be safely ignored. If we used an array of enums, then passing an unrecognized enum would cause the call to fail. (Desirable for requirements, not here.) We could get around that by simply stating that an array of strings is passed instead, but I've already gotten feedback that this is seen as a bit weird in terms of web platform ergonomics.
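The difference between the two styles can be sketched in plain JavaScript. This is an illustration only: `KNOWN_FEATURES`, `readFeatureDictionary`, and `readFeatureEnumArray` are hypothetical names, and the functions imitate how WebIDL bindings treat unknown dictionary members (ignored) and invalid enum values passed as arguments (a `TypeError`).

```javascript
// Features this hypothetical UA understands.
const KNOWN_FEATURES = new Set(['lightingEstimation', 'geoAlignment']);

// Dictionary style: keys the UA does not understand are silently dropped,
// and values can be richer than a boolean (e.g. 'ambient').
function readFeatureDictionary(dict) {
  const understood = {};
  for (const [key, value] of Object.entries(dict)) {
    if (KNOWN_FEATURES.has(key)) understood[key] = value;
  }
  return understood;
}

// Enum-array style: one unrecognized value makes the whole call fail.
function readFeatureEnumArray(list) {
  for (const name of list) {
    if (!KNOWN_FEATURES.has(name)) {
      throw new TypeError(`'${name}' is not a valid feature enum value`);
    }
  }
  return [...list];
}

// Usage: the same unknown feature is ignored in one style and fatal in the other.
console.log(readFeatureDictionary({ lightingEstimation: 'ambient', futureFeature: true }));
try {
  readFeatureEnumArray(['lightingEstimation', 'futureFeature']);
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```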