Implicit consent via getUserMedia should allow access to non-miked speakers #147

Open
guidou opened this issue Oct 21, 2024 · 14 comments
Labels
privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response.

Comments

@guidou
Contributor

guidou commented Oct 21, 2024

The spec says getUserMedia MUST grant access to miked speakers and leaves out non-miked speakers.

A common use case is a desktop user who wants to use a non-miked headphone together with the laptop's built-in microphone.
After getUserMedia, we shouldn't require selectAudioOutput, which requires additional prompts and is a relatively new API, for that common use case.

I propose we add a "MAY grant access to non-miked speakers" to support this use case, so that UAs that don't want to support this use case are not required to implement it.

More importantly, previous versions of this spec allowed this, and Chromium, which for many years was the only available implementation of this API, supported it and still does.
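
A minimal sketch of the distinction at stake, assuming a "miked" speaker is an `audiooutput` device that shares a `groupId` with some `audioinput` device. The `DeviceLike` type and the `partitionSpeakers` helper are hypothetical names introduced for illustration; only the `kind`, `deviceId`, `groupId`, and `label` fields mirror `MediaDeviceInfo`.

```typescript
// Structural subset of MediaDeviceInfo, so the sketch runs outside a browser.
interface DeviceLike {
  kind: "audioinput" | "audiooutput" | "videoinput";
  deviceId: string;
  groupId: string;
  label: string;
}

// Hypothetical helper: split speakers into those that share a groupId with a
// microphone ("miked") and those that do not ("non-miked").
function partitionSpeakers(devices: DeviceLike[]): {
  miked: DeviceLike[];
  nonMiked: DeviceLike[];
} {
  const micGroups = new Set(
    devices.filter((d) => d.kind === "audioinput").map((d) => d.groupId),
  );
  const speakers = devices.filter((d) => d.kind === "audiooutput");
  return {
    miked: speakers.filter((d) => micGroups.has(d.groupId)),
    nonMiked: speakers.filter((d) => !micGroups.has(d.groupId)),
  };
}

// Example: a laptop with a built-in mic/speaker pair plus plain headphones.
const exampleDevices: DeviceLike[] = [
  { kind: "audioinput", deviceId: "mic-1", groupId: "laptop", label: "Built-in microphone" },
  { kind: "audiooutput", deviceId: "spk-1", groupId: "laptop", label: "Built-in speakers" },
  { kind: "audiooutput", deviceId: "spk-2", groupId: "headphones", label: "Headphones" },
];
const partitioned = partitionSpeakers(exampleDevices);
```

Under the current text, only the `miked` set would be implicitly exposed after getUserMedia; the proposed MAY would let a UA also include the `nonMiked` set.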

@guidou guidou changed the title Implicit consent getUserMedia should allow access to non-miked speakers Implicit consent via getUserMedia should allow access to non-miked speakers Oct 21, 2024
@guidou
Contributor Author

guidou commented Nov 7, 2024

cc @jan-ivar @youennf

@guidou
Contributor Author

guidou commented Nov 29, 2024

See also https://bugzilla.mozilla.org/show_bug.cgi?id=1868750

This is a common use case for our users who have both independent microphones and independent speakers, and also for home users who want to use the built-in speakers of their HDMI monitor.

@youennf
Contributor

youennf commented Nov 29, 2024

I agree this is a common use case.
The current status of the spec is:

  1. Use enumerateDevices/setSinkId if you want to consistently use a device offering both microphone and speaker
  2. Use selectAudioOutput whenever a picker approach is preferred.
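
The two paths can be sketched as follows, assuming a page that already captures from a known microphone. `sinkIdForMic` is a hypothetical helper, and `DeviceLike` is a structural stand-in for `MediaDeviceInfo` so the example runs outside a browser. `enumerateDevices`, `setSinkId`, and `selectAudioOutput` are the APIs named above and appear only in comments.

```typescript
interface DeviceLike {
  kind: "audioinput" | "audiooutput" | "videoinput";
  deviceId: string;
  groupId: string;
}

// Hypothetical helper for path 1: find the speaker that belongs to the same
// physical device (same groupId) as the microphone in use. Returns null when
// there is none, in which case the page would fall back to path 2.
function sinkIdForMic(devices: DeviceLike[], micDeviceId: string): string | null {
  const mic = devices.find(
    (d) => d.kind === "audioinput" && d.deviceId === micDeviceId,
  );
  if (!mic) return null;
  const speaker = devices.find(
    (d) => d.kind === "audiooutput" && d.groupId === mic.groupId,
  );
  return speaker ? speaker.deviceId : null;
}

// In a browser, the result would be used roughly like this:
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   const sinkId = sinkIdForMic(devices, micDeviceId);
//   if (sinkId !== null) {
//     await audioElement.setSinkId(sinkId);                            // path 1
//   } else {
//     const picked = await navigator.mediaDevices.selectAudioOutput(); // path 2
//     await audioElement.setSinkId(picked.deviceId);
//   }

const headsetOnly: DeviceLike[] = [
  { kind: "audioinput", deviceId: "mic-h", groupId: "headset" },
  { kind: "audiooutput", deviceId: "spk-h", groupId: "headset" },
];
const laptopMicOnly: DeviceLike[] = [
  { kind: "audioinput", deviceId: "mic-l", groupId: "laptop" },
];
const headsetSink = sinkIdForMic(headsetOnly, "mic-h");
const laptopSink = sinkIdForMic(laptopMicOnly, "mic-l");
```

The second case is the one this issue is about: a microphone with no speaker in its group leaves path 1 with nothing to offer, so the page has to use the UA picker.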

The question is whether we are fine with this approach (which is aligned with our privacy guidelines), or whether we prefer a dual approach where a page can use an in-house speaker picker if it is using the microphone and the UA picker otherwise.

@guidou
Contributor Author

guidou commented Nov 29, 2024

The argument in the spec to give implicit consent to miked speakers is:

> This conveniently handles the common case of wanting to route both input and output audio through a headset or speakerphone device

The other use cases are just as common, so I would say the argument applies there too.

In terms of privacy, I favor the dual approach (in-house picker if using the mic, UA picker otherwise). The reason is that the microphone permission is already a lot more sensitive than the speaker one in terms of privacy/fingerprinting, and the extra privacy issues caused by exposing potential output-only devices look minimal, given that some output devices are already being exposed anyway.

Also, in terms of compatibility, an earlier version of this spec did allow exposing non-miked speakers and was deployed on the Web.

@dontcallmedom dontcallmedom added the privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response. label Dec 10, 2024
@jan-ivar
Member

> The reason is that the microphone permission is already a lot more sensitive than the speaker one in terms of privacy/fingerprinting

The spec exposes microphones and (miked) speakers on microphone use, not permission. We need to resolve w3c/mediacapture-main#1019 first to ensure agreement on the privacy properties speakers would end up with here.

@youennf
Contributor

youennf commented Dec 10, 2024

To be clear, the proposal made during today's WebRTC meeting was to expose all speakers after the page started to use microphones.

@guidou
Contributor Author

guidou commented Dec 11, 2024

> To be clear, the proposal made during today's WebRTC meeting was to expose all speakers after the page started to use microphones.

My proposal is that whenever you're exposing miked output devices, also allow including non-miked devices. But I'm OK with not making it mandatory.
This is orthogonal to deciding how the site is authorized to expose the (miked or non-miked) output devices in the first place.
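
A sketch of the proposal as a pure function, to make the MUST/MAY split concrete. Everything here is an assumption for illustration: `SpeakerPolicy`, `exposedSpeakers`, and the use of `groupId` to decide which speakers count as miked are not taken from the spec text.

```typescript
interface DeviceLike {
  kind: "audioinput" | "audiooutput" | "videoinput";
  deviceId: string;
  groupId: string;
}

// "miked-only" models the current MUST; "include-non-miked" models the
// optional (MAY) behaviour proposed in this issue.
type SpeakerPolicy = "miked-only" | "include-non-miked";

// Hypothetical model of which speakers a UA lists once it has decided to
// expose the speakers grouped with the given microphones. How that decision
// is reached (use, permission, ...) is deliberately out of scope.
function exposedSpeakers(
  devices: DeviceLike[],
  authorizedMicIds: string[],
  policy: SpeakerPolicy,
): DeviceLike[] {
  const groups = new Set(
    devices
      .filter((d) => d.kind === "audioinput" && authorizedMicIds.indexOf(d.deviceId) !== -1)
      .map((d) => d.groupId),
  );
  if (groups.size === 0) return []; // nothing authorized, nothing exposed
  const speakers = devices.filter((d) => d.kind === "audiooutput");
  return policy === "include-non-miked"
    ? speakers
    : speakers.filter((d) => groups.has(d.groupId));
}

// Example: a headset (mic + speaker) plus speakers built into a monitor.
const setup: DeviceLike[] = [
  { kind: "audioinput", deviceId: "mic-h", groupId: "headset" },
  { kind: "audiooutput", deviceId: "spk-h", groupId: "headset" },
  { kind: "audiooutput", deviceId: "spk-m", groupId: "monitor" },
];
const currentSpec = exposedSpeakers(setup, ["mic-h"], "miked-only").map((d) => d.deviceId);
const proposed = exposedSpeakers(setup, ["mic-h"], "include-non-miked").map((d) => d.deviceId);
const unauthorized = exposedSpeakers(setup, [], "include-non-miked");
```

The `authorizedMicIds` parameter stands in for whatever authorization mechanism applies, which keeps the policy choice separate from that mechanism, as argued above.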

@youennf
Contributor

youennf commented Dec 11, 2024

@guidou, the question is whether you think Chrome could align with the spec if the spec allowed all speakers. It seems a major blocking point is exposing an exhaustive list of cameras before cameras were used (the Zoom issue).
For speakers/microphones, the situation might be different.

@guidou
Contributor Author

guidou commented Dec 11, 2024

> @guidou, the question is whether you think Chrome could align with the spec if the spec allowed all speakers.

What spec? If you're referring to mediacapture-output, Chrome would already be compliant if it allowed all speakers. Note also that a previous version of the spec allowed all speakers, and that is what Chrome has implemented since 2015.

> It seems a major blocking point is exposing an exhaustive list of cameras before cameras were used (the Zoom issue). For speakers/microphones, the situation might be different.

Why would it be different between cameras and microphones?
I get that speakers is a different case, because the issue is about what speakers are implicitly authorized after microphone access is authorized.

@guidou
Contributor Author

guidou commented Dec 11, 2024

> The spec exposes microphones and (miked) speakers on microphone use, not permission. We need to resolve w3c/mediacapture-main#1019 first to ensure agreement on the privacy properties speakers would end up with here.

This issue is not about the mechanism to authorize enumeration of devices (use or permissions or whatever), but about allowing (not mandating) exposure of non-miked speakers once exposure of miked speakers is authorized, regardless of the mechanism for such authorization.

Solving w3c/mediacapture-main#1019 is obviously not needed to solve this. The proposed solution does not require any changes to any current browser implementation, but would give extra flexibility to future implementations.

Also, if a future solution of w3c/mediacapture-main#1019 requires changes to mediacapture-output, there is no reason why we cannot make such a change too.

@youennf
Contributor

youennf commented Dec 12, 2024

> Why would it be different between cameras and microphones?

Some websites allow users to enter a call without camera capture.
On the other hand, websites usually ask for microphone access more quickly (maybe because there is no green pill as there is for cameras, or maybe because they want to be able to detect voice activity).

For instance, both Zoom and Google Meet keep microphone capture live when the user mutes the microphone, but stop camera capture when the user mutes the camera.
According to my testing, even though Zoom might not always access the camera when joining a call, it will consistently access the microphone.

This leaves the possibility for a simpler compat story for microphones (and so speakers) than for cameras.
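
The two mute behaviours described above map onto two standard options on `MediaStreamTrack`: setting `enabled = false`, which keeps the track live, and calling `stop()`, which ends capture. The sketch below illustrates only that difference. How Zoom or Google Meet actually implement muting is not known here, and `FakeTrack`, `softMute`, and `hardMute` are hypothetical names so the example runs outside a browser.

```typescript
// Minimal stand-in for the parts of MediaStreamTrack used here.
class FakeTrack {
  enabled = true;
  readyState: "live" | "ended" = "live";
  stop(): void {
    this.readyState = "ended";
  }
}

// Soft mute: the track stays live (capture continues) but outputs silence.
function softMute(track: FakeTrack): void {
  track.enabled = false;
}

// Hard mute: capture ends; un-muting needs a new getUserMedia() call.
function hardMute(track: FakeTrack): void {
  track.stop();
}

const micTrack = new FakeTrack();
const cameraTrack = new FakeTrack();
softMute(micTrack);
hardMute(cameraTrack);
```

A page that only ever soft-mutes its microphone keeps a live microphone track for the whole call, which is why device exposure keyed on microphone use would rarely break it.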

@guidou, have you made an assessment specific to speakers?

@guidou
Contributor Author

guidou commented Dec 12, 2024

> Why would it be different between cameras and microphones?

> Some websites allow users to enter a call without camera capture. On the other hand, websites usually ask for microphone access more quickly (maybe because there is no green pill as there is for cameras, or maybe because they want to be able to detect voice activity).

> For instance, both Zoom and Google Meet keep microphone capture live when the user mutes the microphone, but stop camera capture when the user mutes the camera. According to my testing, even though Zoom might not always access the camera when joining a call, it will consistently access the microphone.

> This leaves the possibility for a simpler compat story for microphones (and so speakers) than for cameras.

This is a valid point, but it doesn't necessarily have to be that way forever and it's orthogonal to this issue (non-miked speakers). For example, one reason for keeping the microphone open is to detect when the user is speaking so that the application can remind the user that they're muted. APIs like MediaSession's voiceactivity and other future APIs might make it unnecessary for applications to access the microphone signal for these use cases.
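
As a rough illustration of why an application might keep reading the microphone signal while muted, here is a sketch of a level-based "you're muted" check. It is an assumption about how such a reminder could work, not a description of any particular application; `rmsLevel`, `shouldRemindMuted`, and the 0.05 threshold are hypothetical.

```typescript
// Root-mean-square level of one block of PCM samples in the range [-1, 1].
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  samples.forEach((s) => {
    sumSquares += s * s;
  });
  return Math.sqrt(sumSquares / samples.length);
}

// Hypothetical rule: remind the user when they appear to be speaking while
// muted. The 0.05 threshold is an arbitrary illustrative value, not a tuned one.
function shouldRemindMuted(
  samples: Float32Array,
  muted: boolean,
  threshold: number = 0.05,
): boolean {
  return muted && rmsLevel(samples) > threshold;
}

const silence = new Float32Array(128); // all zeros
const speechLike = new Float32Array(128).fill(0.5); // constant, loud block
```

A check like this needs raw samples, so it only works while the page holds a live microphone track. An API that reported voice activity directly would remove that requirement.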

> @guidou, have you made an assessment specific to speakers?

The point about speakers is that users want to be able to select non-miked speakers in many cases. It's common for VC users to use a laptop microphone together with non-miked headphones or non-miked monitor speakers. Users get confused if the application picker shows some speakers but not the ones they want to use. Then they file bugs like https://bugzilla.mozilla.org/show_bug.cgi?id=1868750

@youennf
Contributor

youennf commented Dec 12, 2024

> This is a valid point, but it doesn't necessarily have to be that way forever and it's orthogonal to this issue

I understand this, but we were not able to reach consensus at the last meeting because of w3c/mediacapture-main#1019.

I am hoping that we can remove speakers (maybe microphones as well) from the scope of w3c/mediacapture-main#1019, since I believe this is all about compat here.

That way, we can make progress quickly here and we will have more fruitful/focused discussions on w3c/mediacapture-main#1019.

5 participants