Add getCurrentBrowsingContextMedia #148
@eladalon1983 fwiw, the IPR checker is flagging this because Google needs to rejoin the group - cf https://lists.w3.org/Archives/Public/public-webrtc/2020Oct/0005.html |
I filed an issue for this PR to be able to reference: #149 |
@dontcallmedom: Thanks for explaining. Until Google rejoins, do I want to skip the check by marking it non-substantive? Or possibly there's another way to avoid waiting for Google to rejoin? @henbos: Thanks. |
@eladalon1983 do you know how long it might take for Google to rejoin? if this is only short term, I think the simplest approach is to wait until it happens; if there is a risk that it might take longer, I'll suggest an alternative approach. |
I don't know how long it's going to take. Could potentially be a few days, I hear through the grapevine. |
Does this allow for capturing the browsing context or the document/global within it? What if it's navigated across the origin boundary for instance? |
Authoring name is arlyss engebretson
@annevk, this allows capturing the browsing context, but I don't think navigation is an issue. If tab X captures itself, then navigates to another URL, then the app which captured tab X unloads and the capture ends. (I am not aware of a mechanism for ownership of the MediaStreams to be passed from the capturing application to another application/service-worker before it unloads.) |
In that case it doesn't capture the browsing context. The browsing context typically outlives a navigation. |
The browsing context outlives the navigation. It's the capturing entity which is unloaded when one navigates away, even though the captured entity remains. |
I don't think that's really true, since it seems to me you are capturing a rendered document. A browsing context is just a container for a sequence of documents. It doesn't really have the capacity to be captured. |
Implementation influencing API: The method of capturing done by GetDisplayMedia (and this version) involves grabbing the framebuffer after the browsing context has been rendered, I believe. So the capturing operation doesn't see any DOM object within the document. It's been suggested to put this API on the document object instead of on the navigator object, but I'm not sure that makes sense. |
Right, the theoretical model is such that documents get rendered. Don't really have a strong opinion on which object you put it, but I don't think this qualifies for adding "browsing context" as a web developer-exposed term. (To be clear, I understand @jan-ivar has other concerns with this API and if any of this is in conflict with him I'll happily defer.) |
TL;DR: * This is an API for capturing the current tab. * This CL handles the Blink part. Explainer: https://docs.google.com/document/d/1CIQH2ygvw7eTGO__Pcds_D46Gcn-iPAqESQqOsHHfkI Design doc: go/get-current-browsing-context-media Intent-to-Prototype: https://groups.google.com/u/3/a/chromium.org/g/blink-dev/c/NYVbRRBlABI/m/MJEzcyEUCQAJ PR against spec: w3c/mediacapture-screen-share#148 Next steps: * Implement the confirmation-box. * Implement unit-tests that rely on the confirmation-box. * Graduate this to an origin-trial. Bug: 1136942 Change-Id: I81333274075cd56d7e628a8a0eb025b1ae08645a Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2500841 Reviewed-by: Daniel Cheng <[email protected]> Reviewed-by: Guido Urdaneta <[email protected]> Commit-Queue: Elad Alon <[email protected]> Cr-Commit-Position: refs/heads/master@{#823498}
This is the second step in implementing getCurrentBrowsingContextMedia behind a runtime flag. TL;DR: This is an API for capturing the current tab. Explainer: https://docs.google.com/document/d/1CIQH2ygvw7eTGO__Pcds_D46Gcn-iPAqESQqOsHHfkI Design doc: go/get-current-browsing-context-media Intent-to-Prototype: https://groups.google.com/u/3/a/chromium.org/g/blink-dev/c/NYVbRRBlABI/m/MJEzcyEUCQAJ PR against spec: w3c/mediacapture-screen-share#148 Next steps: * Implement the confirmation-box. * Implement unit-tests that rely on the confirmation-box. Bug: 1136942 Change-Id: I8b25baa85565999ec44ed2f1b0bd1e19d6f148c4 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2502628 Reviewed-by: Guido Urdaneta <[email protected]> Reviewed-by: Yuri Wiitala <[email protected]> Reviewed-by: Robert Kaplow <[email protected]> Reviewed-by: Sergey Ulanov <[email protected]> Commit-Queue: Sergey Ulanov <[email protected]> Cr-Commit-Position: refs/heads/master@{#824534}
…edia API." This reverts commit 5aea604. Reason for revert: This CL is likely the cause of build failure for Linux ChromiumOS MSan Tests and Linux Chromium OS ASan LSan Tests (1) First occurance: https://ci.chromium.org/p/chromium/builders/ci/Linux%20ChromiumOS%20MSan%20Tests/21455 and https://ci.chromium.org/p/chromium/builders/ci/Linux%20Chromium%20OS%20ASan%20LSan%20Tests%20%281%29/38935 Failed tests: GetCurrentBrowsingContextMediaDialogTest.DefaultAudioSelection GetCurrentBrowsingContextMediaDialogTest.DoneCallbackCalledWhenWindowClosed GetCurrentBrowsingContextMediaDialogTest.DoneCallbackCalledWhenWindowClosedWithoutCheckboxTicked GetCurrentBrowsingContextMediaDialogTest.DoneCallbackCalledWithAudioShare GetCurrentBrowsingContextMediaDialogTest.DoneCallbackCalledWithAudioShareFalse GetCurrentBrowsingContextMediaDialogTest.DoneCallbackCalledWithNoAudioShare GetCurrentBrowsingContextMediaDialogTest.DoubleTapOnShare GetCurrentBrowsingContextMediaDialogTest.ShareButtonAccepts Original change's description: > Implement the confirmation-box for getCurrentBrowsingContextMedia API. > > Rebased on top of https://chromium-review.googlesource.com/c/chromium/src/+/2502628 > > UI without audio capture: https://drive.google.com/file/d/1SA9vuDOkQjnioBfmAaiOjqoXGBUmVw22/view?usp=sharing > UI with audio capture: https://drive.google.com/file/d/1jcncgHsF6L_o3D5Jc3UJ6pkjwQAKtoVl/view?usp=sharing > > This change relates to the UI code that is added to support the new getCurrentBrowsingContextMedia API. > > This is an API for capturing the current tab. 
> > Design doc: > go/get-current-browsing-context-media > > Intent-to-Prototype: > https://groups.google.com/u/3/a/chromium.org/g/blink-dev/c/NYVbRRBlABI/m/MJEzcyEUCQAJ > > PR against spec: > w3c/mediacapture-screen-share#148 > > > Bug: 1136942 > > Change-Id: I8e72023d944df3d7e996ad3acea7527c34569868 > Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2489991 > Commit-Queue: Palak Agarwal <[email protected]> > Reviewed-by: Guido Urdaneta <[email protected]> > Reviewed-by: Peter Boström <[email protected]> > Reviewed-by: Elad Alon <[email protected]> > Reviewed-by: Elly Fong-Jones <[email protected]> > Cr-Commit-Position: refs/heads/master@{#831017} [email protected],[email protected],[email protected],[email protected],[email protected] Change-Id: I25b9e79d7eb61b5e43961df61999fd8c20954c8f No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 1136942 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2560358 Reviewed-by: Maggie Cai <[email protected]> Commit-Queue: Maggie Cai <[email protected]> Cr-Commit-Position: refs/heads/master@{#831228}
I agree with @annevk here on naming. The document is the largest object being captured.
@eladalon1983 When we say something "captures itself", that something is the document, not the tab. → getDocumentMedia.
@alvestrand Ah, I couldn't figure out why the API allowed an iframe to capture its embedder, when none of the use cases require it. → getTopLevelDocumentMedia would also be a mouthful. I'd humbly suggest an iframe only capture itself. A simpler story, and a useful cropping tool to boot. For implementations, cutting out the relevant rectangle from the framebuffer is hopefully not too difficult. |
An iframe only capturing itself would not cover some interesting use cases.
For example, a document being presented to a VC, by embedding an iframe of
the VC's application inside the document-editor application. Or a game
capturing footage of itself by embedding an iframe which also contains code
and visible controls for managing the capture, annotating it, and uploading
it to a remote server, possibly one which streams it to remote viewers.
…On Mon, Nov 30, 2020 at 8:08 PM Jan-Ivar Bruaroey ***@***.***> wrote:
I agree with @annevk <https://github.com/annevk> here on naming. The
document is the largest object being captured.
If tab X captures itself, ...
@eladalon1983 <https://github.com/eladalon1983> When we say something
"captures itself", that something is the document, not the tab. →
getDocumentMedia.
Implementation influencing API: The method of capturing done by
GetDisplayMedia (and this version) involves grabbing the framebuffer after
the browsing context has been rendered, I believe.
@alvestrand <https://github.com/alvestrand> Ah, I couldn't figure out why
the API allowed an iframe to capture its embedder, when none of the use
cases require it. → getTopLevelDocumentMedia would also be a mouthful.
I'd humbly suggest an iframe only capture itself. A simpler story, and a
useful cropping tool to boot.
For implementations, cutting out the relevant rectangle from the
framebuffer is hopefully not too difficult.
|
Isn't the document of the frame being captured there? Or do you mean it would include nested documents? Again though, "browsing context" doesn't capture that at all. They're just an abstract holder of a sequence of documents, only one of which is currently active. (And depending on how history is revamped that model might change a bit still.) |
I leave discussions of the API's name to those with more experience in
this field than me. I have no strong opinions in this matter, other than a
mild preference for brevity.
…On Tue, Dec 1, 2020 at 9:31 AM Anne van Kesteren ***@***.***> wrote:
Isn't the document of the frame being captured there? Or do you mean it
would include nested documents? Again though, "browsing context" doesn't
capture that at all. They're just an abstract holder of a sequence of
documents, only one of which is currently active. (And depending on how
history is revamped that model might change a bit still.)
|
@eladalon1983 A security property I like about "capture itself", is requiring explicit code in the capture target. Easy to grasp, ensures buy-in of the target, and no-one has to worry about being captured by other origins (except through the existing hyper-user-driven I was even wondering about enforcing this by checking await iframe.contentWindow.navigator.mediaDevices.getDocumentMediaWhatever() // SecurityError
I don't understand this use of "itself". Can't it put code in the parent as well, use postMessage etc? We seem to be making much stronger assumptions about the capture target's involvement elsewhere, e.g. "Since the app can take for granted that the captured content is of itself, it knows how to crop sensibly". I feel strongly we shouldn't be over-capturing, only to create a need for inventing cropping APIs next. I think we can lean on apps to use iframes to capture exactly what they want to send and no more. This is the web after all, so I'd aim for stronger integration than the existing modality. |
Cropping is a different issue that we happen to be interested in, and only one example of why it's useful for the application to know that it is capturing its own tab. Shelving it for the time being, let's please examine the scenario of a game running in the browser, and wanting to stream itself to a service like Twitch. The streaming service could "publish" an iframe or a script that can be embedded in a game, implementing that functionality. The alternatives I can think of are less preferable. They include:
(a) Each game re-implementing the streaming capability, probably by importing the streaming service's code into their own codebase. I don't think that's a reasonable alternative, as the copied code would be running *same-origin*, and would have to be scrutinized before being imported; likely it either won't be scrutinized, or it would be running an older, already-scrutinized revision; also not very secure.
(b) Passing frames and audio via postMessage, which would be prohibitively inefficient.
Also, since iframes can embed additional iframes, I am not sure that "capture only this iframe" solves the issue of isolating what is captured. I think an opt-out mechanism for being captured by one's embedder would still be necessary. And once such a mechanism is provided (for other readers - I have suggested such a mechanism elsewhere), I think it can be used for making a safe implementation of capture-entire-tab.
|
I am still unclear about the goal of the API, which makes it hard to discuss the API surface. |
With a prompt. There is a mock in the explainer
<https://docs.google.com/document/d/1CIQH2ygvw7eTGO__Pcds_D46Gcn-iPAqESQqOsHHfkI/edit?usp=sharing>.
The UA is of course free to make the dialog even more "frictive," e.g. by
presenting a preview image of the tab that will be captured in the dialog,
possibly also requiring that it be selected prior to pressing the "Share"
button (similarly to what Chrome currently does for selecting a
full-desktop capture).
…On Tue, Dec 1, 2020 at 6:48 PM youennf ***@***.***> wrote:
I am still unclear about the goal of the API, which makes it hard to
discuss the API surface.
Either we are talking of a privileged API, thus there is a prompt
somewhere. In that case, we should investigate how much different it would
be from getDisplayMedia, how the UI would be more intuitive and so on.
If we are talking about a no-prompt approach, this is another story where
API could be at element level for instance like fullscreen, and we could
constrain the element properties.
Can somebody clarify which approach is actually envisioned?
|
I don't think we should introduce artificial reasons to crop, nor assume cropping will ever be added as a feature since it's a slippery slope to image processing, something this WG appears to be leaning more toward raw media access to solve.
Exactly. To protect users from dubious information-harvesting JS libraries, I think I'd prefer this to receive the same level of scrutiny that a service provider performs to protect itself. I don't think we should make it easier to export user trust to entities the service provider itself doesn't trust, because if a service provider doesn't trust a library then users probably shouldn't either. |
In the scenario described, the service provider trusts the (specific) third party enough to (1) embed it and (2) provide its iframe with allow=display-capture. The service provider should not, IMHO, be forced to choose between not trusting the (specific) third party at all and trusting it enough to allow it to run same-origin.
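For readers following along, the embedding described above could look roughly like the sketch below. It is only an illustration: the helper name `createCaptureWidgetFrame` and the widget URL are hypothetical, and the only thing taken from the thread is that the embedder grants `display-capture` to a cross-origin iframe through its `allow` attribute.

```javascript
// Sketch: embed a third-party capture widget cross-origin and delegate only
// the display-capture feature to it.
// `doc` is any Document-like object; `widgetUrl` is a hypothetical placeholder.
function createCaptureWidgetFrame(doc, widgetUrl) {
  const frame = doc.createElement("iframe");
  frame.src = widgetUrl;
  // The `allow` attribute delegates the named feature to the embedded document.
  frame.allow = "display-capture";
  return frame;
}
```

In a page this would be called as `document.body.appendChild(createCaptureWidgetFrame(document, url))`. The widget stays in its own origin, which is the middle ground between "no trust" and "same-origin" that the comment argues for.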
|
I'd also like to clarify that cropping is not the main issue with capture-this-tab, but rather just an example. I see the advantages of capture-this-tab as being the following, and in this order:
1. [Main advantage:] Nearly eliminates the risk of users sharing the wrong thing (choosing the wrong thing in a chooser). IMHO, this is a big selling point.
2. [Important, but not main advantage:] Allows the application the knowledge of the contents of the capture, allowing processing of any sort, cropping (even in JS) being just one example. Other (hypothetical) examples include blurring part of the captured stream without blurring what the local user sees, censoring private information that's in the middle of the captured content (can't be cropped) without obscuring it for the local users, etc. This is just a short list of hypotheticals I could think of, however.
3. [Lower importance:] Provides a streamlined interface for sharing, making both users' and developers' lives easier. Contrast the "share this document" flow using getCurrentBrowsingContextMedia with the flow using getDisplayMedia. With getDisplayMedia, if the user chooses the wrong source in the picker, the application has to (a) detect it, (b) discard the MediaStream, (c) explain to the user what mistake they made and how to avoid it in the future, and then (d) prompt the user to try again. Cumbersome.
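Steps (a) and (b) of the getDisplayMedia flow in item 3 might be sketched as follows. This is a rough illustration, not anything proposed in the thread: the helper name `keepOnlyTabCapture` is made up, it assumes the browser reports the `displaySurface` setting on the video track, and a value of `"browser"` only says that some tab was chosen, not that it is the current one.

```javascript
// Sketch: detect a non-tab capture and discard it.
// Returns true if the stream's video track reports a tab ("browser") surface.
// Otherwise stops every track so the capture is discarded, and returns false.
function keepOnlyTabCapture(stream) {
  const [videoTrack] = stream.getVideoTracks();
  const surface = videoTrack
    ? videoTrack.getSettings().displaySurface
    : undefined;
  if (surface === "browser") return true;
  // Wrong (or unknown) surface: release the capture before re-prompting.
  stream.getTracks().forEach((track) => track.stop());
  return false;
}
```

Steps (c) and (d), explaining the mistake and prompting again, remain application UI work, which is the cumbersome part the comment refers to.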
|
@eladalon1983 but this feature undermines many of the same-origin protections you get from iframing in the first place.
Right, and this knowledge adds risk. What about getDisplayMedia? While I think it's fair to say I do however find the idea of enabling apps to stream themselves into web conferences appealing, provided it can be integrated safely. |
It's also worth pointing out that the definition of browsing context is somewhat in flux (precisely because it's not web-developer exposed); we're currently working on transitioning that single concept into three: browsing context, browsing session, and top-level navigable. You can learn more about that in whatwg/html#5767 and whatwg/html#6356. |
Chrome Security is of the opinion that a confirmation-only flow for gCBCM (getCurrentBrowsingContextMedia) would require security measures similar to those which @jan-ivar has suggested. Namely, a combination of (a) site-isolation and (b) a new COEP-like header for opting in to capturability by embedder. I think we have the following issues to resolve, then:
Name
We are open-minded about this. I personally like
Security Measures
I think we can continue this discussion in thread #155.
Behavior when security-measures do not hold
The application might try to call gCBCM from almost any context. If gDM is not allowed to be called from that context, then gCBCM should also not be permitted. (For example, calling gCBCM from an iframe that does not have the prerequisite display-capture permission from its embedder.) gCBCM introduces new requirements in addition to gDM’s: (a) site-isolation and (b) a new HTTP header (definition pending in separate thread). If either of these conditions does not hold at the time when gCBCM is called, we would like to specify that the user agent SHOULD (or MAY) fall back to gDM-like behavior. That is, display a dialog that does not limit the user's selection to just this tab.
Rationale for Fallback
One possibility for applications is to inspect the failure reason of gCBCM calls, and if it’s due to the missing header, to call getDisplayMedia “manually” as a fallback. This is possible, but clunky. It’s also arguable that exposing the exact reason gCBCM was rejected is undesirable. (Normally, a top-level application would not be allowed to see what HTTP headers embedded content uses, let alone embedded content twice-removed. Normally, things either load or don’t.) We believe that it will be helpful if we specify that the user agent SHOULD default to a gDM-like dialog (or possibly MAY). The only new problem introduced, uncertainty by the application over whether the user really chose the current tab, can be resolved in several ways. It may be left to the application (e.g., pixel test), or we could discuss more ergonomic solutions: using a returned value, using the label of MediaStreamTrack (Firefox currently uses windows’ titles as the label), etc.
Audio Playback Suppression
I will be using “playback-suppress” as shorthand for “mute the audio from the speaker’s point-of-view, but still make this audio capturable.” While a user is capturing audio from a tab, it’s sometimes useful to prevent that tab from performing audio playback through the speakers. This is useful, for example, for performing echo cancellation, which works better if the audio captured on the playback-suppressed tab is sent back to some like is other participants’ audio, and is seen by the echo canceller as just another remote-sourced audio stream played out over the speakers. I see some options here:
|
Name: I think the concerns raised with browsing context apply equally to tab. This is much more narrow-scoped than a tab. "Page" might be okay and has some precedent in CSS. (Also, thanks for the update!) |
@annevk I think this is a case where "tab" is narrower-scoped than page, because a page may have a much larger surface area, and we only want to capture what the user sees at the moment. That is: the intersection of the top-level browsing context's viewport and the rendering boundaries of the requesting document, including any content overlaid by CSS (i.e. excluding any content 100% occluded by CSS at the moment): ...OR (to complicate matters) depending on the outcome of the above discussion on letting an iframe capture its parent: In either case, this seems best for both privacy and efficiency (we don't want to have to re-render a page for this). If the user scrolls the page they may reveal more info. So I've been calling this getTabMedia. Though perhaps getCurrentTabMedia is more precise? — In either case, the requester cannot outlive its target, so I wouldn't worry about "tab" implying capture past navigation. |
I'd rather keep this separate from gDM, even using a separate permissions policy, since the security properties are quite different. The callsites may not even be the same always. E.g. this is the target being told to capture itself and beam into a meeting, vs the main window where gDM may be called to present today. I see no way around apps needing to check for errors. Especially if we go with the model where capture may terminate on a non-opt-in iframe loading. Apps would need to catch that error too, and respond appropriately. |
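To make the "apps would need to catch that too" point concrete, here is a minimal sketch of how an app could notice that a capture was terminated from outside. The helper name `onCaptureEnded` is hypothetical, and it is an assumption (not something settled in this thread) that such a termination would surface as the standard `ended` event on the captured tracks.

```javascript
// Sketch: run `callback` once for each track whose capture ends externally.
// `stream` is a MediaStream-like object whose tracks are EventTargets.
function onCaptureEnded(stream, callback) {
  for (const track of stream.getTracks()) {
    // "ended" fires when the source stops providing data, not when the app
    // itself calls track.stop().
    track.addEventListener("ended", () => callback(track), { once: true });
  }
}
```

An app could use the callback to tear down its peer connection or to tell the user that sharing has stopped.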
That's a fair criticism of page, but I don't think tab really captures it either. This doesn't match the lifetime of a tab and tab is rather implementation-specific and might not exist on all platforms. (Also, tab in implementations is rather analogous to top-level browsing context (or browsing session, once we have that) and this clearly isn't that.) |
re naming, wouldn't viewport be a better characterization of the target of the capture? |
Yeah, I think viewport would work, though with the caveat that in theory nested documents have their own viewport and I don't think there is a clearly defined term for the composition of them. |
The lifetime of a capture target >= lifetime of a capture. Sites today can capture the display (with getDisplayMedia) and the user (with getUserMedia), both of which (hopefully!) outlive the page and its capture of them. I don't feel users or developers are confused by that. getViewportMedia could work, but is it distinct enough from getDisplayMedia? FWIW, "screen", "window", and "tab" are the lay terms around screen-sharing UX in browsers today. AFAIK, screen-sharing isn't available yet on mobile, but I believe the term "tab" exists there as well as an organizing container/lay term for browsing context. My objection to "browsing context" wasn't its scope, but it being a technical under-the-hood term previously unexposed in the platform. |
My objection is the scope. 😊 Both screen (has precedent with (To add, I wouldn't find it problematic to expose |
Fair enough. Of those I think I'd pick viewport, to emphasize we're not necessarily capturing the whole document. |
I also like viewport. There are certain browser UI elements that are bound to specific tabs, but which do not get captured. The developer console, for instance. Using "tab" does not make immediately clear that those are not captured. So I assume it's |
A precedent for the term "viewport" being exposed to the web is window.visualViewport. I guess it remains to be seen whether we'd capture the layout viewport or the visual viewport on mobile. |
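For reference, the two viewports mentioned here can already be compared from script. The sketch below is illustrative only: `describeViewports` is a made-up helper, and it says nothing about which viewport a future capture API would actually use.

```javascript
// Sketch: report the layout viewport and (if available) the visual viewport.
// `win` is a Window-like object; pass `window` in a page.
function describeViewports(win) {
  const root = win.document.documentElement;
  // Layout viewport size, excluding scrollbars.
  const layout = { width: root.clientWidth, height: root.clientHeight };
  // The visual viewport shrinks under pinch-zoom or an on-screen keyboard.
  const vv = win.visualViewport;
  const visual = vv
    ? { width: vv.width, height: vv.height, scale: vv.scale }
    : null;
  return { layout, visual };
}
```

On desktop without zoom the two usually match; on mobile with pinch-zoom the visual viewport is the smaller region, which is why the choice matters for capture.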
Requiring a separate permission policy is fine by me. Let's assume that Rationale - we expect this to happen >99% of the time, at least in the early days, and we don't believe the feature will be useful without it. The compromise that Chrome has reached internally between the demands from Security and the needs of potential feature-customers, is that a confirmation-only dialog is displayed if all of the new security requirements are satisfied, and an explicit-selection dialog is shown otherwise, which is generally gDM-like, but highlights in some spec-compliant way that the application would like to get the current tab. (For example, consider a UA that normally offers windows as the first option if gDM is called, but offers tabs as the first option if gVM-fallback-mode is used.) At the bottom of my comment is an illustration of what Chrome thinks of using. I mention Chrome-specifics only so as to explain our motivation. Spec-wise, Chrome's specific dialog is of course out of scope. For the spec-change, I think the right way to go about it is to say that the user agent SHOULD/MAY fall back to any behavior that complies with the restrictions placed on gDM, but that this behavior MAY differ from the specific UA's usual gDM behavior. (Or maybe we can leave this "MAY differ..." part implicit.) Lastly, we can make this fallback behavior temporary, giving sites time to adopt the security requirements that we introduce. |
This would make getViewportMedia a weakened version of getDisplayMedia, which seems problematic. I don't think we can infer that an app calling one API wants to fall back to calling the other in all circumstances. This seems app-specific, and a few lines of code:
let stream;
try {
  stream = await navigator.mediaDevices.getViewportMedia();
} catch (e) {
  // Rethrow anything other than the security failure we know how to handle.
  if (e.name != "SecurityError") throw e;
  stream = await navigator.mediaDevices.getDisplayMedia(); // ¹
}
This would be well-tested, because as you say: "we expect this to happen >99% of the time, at least in the early days".
1) If Chrome wants to weaken the already strained security properties of getDisplayMedia, it can do so here without melding APIs together, by ignoring spec recommendations and detecting this situation. |
|
From the spec: "User Agents are encouraged to warn users against sharing browser display devices as well as monitor display devices where browser windows are visible, or otherwise try to discourage their selection on the basis that these represent a significantly higher risk when shared.". See also crbug 920752.
I'm merely making the point that it's completely doable, from the same justifications made for wanting to standardize it. Except Chrome would (hopefully) be alone in making this convenience/safety tradeoff. So any definition would do, e.g. the same run of the event loop. If Chrome would rather not do it because it's non-standard, that would be understandable, and desirable from my point of view. I'd be opposed to standardizing any parameter related to this, because I think it's bad for privacy for the reasons stated.
We are standardizing gVM specifically to serve this need. I see no new information to revisit gDM. |
The attacks we have discussed so far all required a single frame to perform. A malicious application can preload occluded cross-origin iframes and flash them to the screen for the duration of a single frame immediately after the user approves screen-capture. As soon as the user approves, it becomes too late to hide anything from the app. Switching tabs, minimizing windows, etc. - such steps do not offer protection from a malicious app. The decisive moment is when the user accepts. Currently, Safari offers only the entire screen; Chrome and Edge offer screen/window/tab, with the first option on offer being screen. Most users have a single screen, and it's showing the current tab at the moment capture starts. Any danger that exists with capturing the current tab, also exists when capturing the current screen - and more (e.g. see titles of other tabs). Dialogs offering unconstrained choice to the user, but with focus moved away from current-screen towards current-tab, are more secure than dialogs that push towards sharing the entire screen. Helping browsers move to more secure options creates a more secure Web. In order to be implemented, it helps if work on If you can help me find a variation that satisfies everyone¹, or that can be an acceptable compromise for everyone, I would be very grateful. This can include any old/new idea, or any temporary compromise. I believe it will also be good for security and privacy on the Web. ¹ Including one customer for this feature which will only be able to adopt COOP+COEP in the mid-term future, and the new header only in the long-term future. And this customer is the motivation for our investment of headcount in this. |
I don't think falling back to gDM when gVP fails is a good approach:
Regarding the fact that sharing a full-screen is as or more dangerous than sharing a tab in a single screen scenario, I think part of the reasoning is that users understand much better that sharing your entire screen is potentially scary, whereas they might think it is benign to share the tab that is asking for screen sharing. So it is not that it is safer, but that users will have a more accurate understanding of the risks. And conversely, because developers might expect users will be offered a scary choice, it makes it a less attractive option for attackers. Separately, we've heard several times that the current lack of ability for developers to guide the capture surface in gDM leads to suboptimal UX - I wonder if we should look into reinstating a way for developers to give a hint, which UAs could choose whether and how to take into account (e.g. based on previous interactions of the user with the site, maybe based on the cross-origin isolation status of the tab if a tab is being requested, …) - but that would need to be a separate discussion. |
As for gDM as a back-up, I think we should first let websites experiment with it themselves. UAs can learn from it as well. It is easier to add additional parameters later if we think there is value based on that. I see two requirements from that discussion:
Before going down that road, I think UA implementors should investigate what they can do on their own. |
@eladalon1983 I showed how this customer can accomplish their workflow with a few lines of JS, and how browsers that want to can detect and optimize the UX flow in that case, even though we don't recommend it. Can you help me understand why they'd need this behavior standardized for all apps? As to browser UX, specs have a hard time mandating it. Where strong recommendations have had teeth in the past, they're paired with good privacy or security arguments, to convince (or shame) implementers with. So proposing a parameter to induce UX (going against strong privacy recommendations) seems like a dead-end.
@youennf Agreed, among a list of other things.
Transient activation is time-based, so we may not need to say anything about it since gvp would fail immediately. |
I do not see how it can be guaranteed today with a time-based approach. The fact that gvp fails immediately is an implementation decision. Asynchronous queries might be required to actually make it fail. Even if it fails synchronously, there might be some time spent in doing this computation. |
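The timing concern can be illustrated with a sketch of an app-side fallback that checks whether the user gesture is still usable before calling getDisplayMedia. Everything here is an assumption for illustration: `captureWithFallback` is a made-up helper, getViewportMedia is the proposed (not shipped) method from this thread, its rejection with a SecurityError follows the earlier snippet, and `navigator.userActivation` may be missing in some browsers.

```javascript
// Sketch: try the proposed getViewportMedia(), then fall back to
// getDisplayMedia() only if the user activation has not expired meanwhile.
// `nav` is a Navigator-like object; pass `navigator` in a page.
async function captureWithFallback(nav) {
  try {
    return await nav.mediaDevices.getViewportMedia();
  } catch (e) {
    // Only the security failure is handled here; everything else propagates.
    if (e.name !== "SecurityError") throw e;
  }
  // If the activation state is known and already expired, getDisplayMedia()
  // would be rejected, so ask for a fresh user gesture instead.
  const activation = nav.userActivation;
  if (activation && !activation.isActive) {
    throw new Error("User activation expired; a new user gesture is needed");
  }
  return nav.mediaDevices.getDisplayMedia();
}
```

If the first call takes a long time to fail, the second branch is where an app would end up asking the user to click again, which is the scenario being debated.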
(One more suggestion, probably the last one on this topic. The main difference here is that all security restrictions now always apply.) What if
With this, the standard does not allow gVM to become a different version of gDM. Rather, it allows gVM to become a normal version of gDM. It makes the process more user-driven. Wdyt? Here is a mock where Here is a mock with Important differences from previous suggestion:
|
Let's close this and move getViewportMedia to https://github.com/w3c/mediacapture-screen-share/issues/155 since we've reached WG consensus (slide) to site-isolate the API. This issue gets credit for birthing getViewportMedia, but has become a source of confusion. The (early) name "getCurrentBrowsingContextMedia" is both a Chrome origin trial (without site-isolation), and now also a different competing "hybrid" picker-based API proposal from Google without support from this WG. |
TL;DR: * This is an API for capturing the current tab. * This CL handles the Blink part. Explainer: https://docs.google.com/document/d/1CIQH2ygvw7eTGO__Pcds_D46Gcn-iPAqESQqOsHHfkI Design doc: go/get-current-browsing-context-media Intent-to-Prototype: https://groups.google.com/u/3/a/chromium.org/g/blink-dev/c/NYVbRRBlABI/m/MJEzcyEUCQAJ PR against spec: w3c/mediacapture-screen-share#148 Next steps: * Implement the confirmation-box. * Implement unit-tests that rely on the confirmation-box. * Graduate this to an origin-trial. Bug: 1136942 Change-Id: I81333274075cd56d7e628a8a0eb025b1ae08645a Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2500841 Reviewed-by: Daniel Cheng <[email protected]> Reviewed-by: Guido Urdaneta <[email protected]> Commit-Queue: Elad Alon <[email protected]> Cr-Commit-Position: refs/heads/master@{#823498} GitOrigin-RevId: bc949e9d94ea6496b15153f5486a12608db7152b
getCurrentBrowsingContextMedia is equivalent to getDisplayMedia, other than that it may only capture the tab from which it is called. This allows for a simpler selection to be displayed for the user - rather than an elaborate picker, a simple dialog box is used. This simplifies things for the user and reduces the risk of the user sharing something other than what they had intended.
See also:
We think that the security properties of own-tab capture are no worse than the version that goes via the picker. We note that the application will have control over the surface that is being displayed, and that can cause some sharing of information that would otherwise be inaccessible, such as colors on visited links, or content of embedded frames, but we think that the risks are no bigger than for regular sharing, and that the proposed, simple, prompt is good enough to mitigate this concern.
This new API will be subject to access-permissions laid down by the display-capture feature-policy. (Support will be added in Chrome for this feature-policy as part of the work on this feature.)