Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async Clipboard - image/delayed content #350

Closed
3 of 5 tasks
garykac opened this issue Mar 9, 2019 · 21 comments
Closed
3 of 5 tasks

Async Clipboard - image/delayed content #350

garykac opened this issue Mar 9, 2019 · 21 comments
Assignees
Labels
Resolution: timed out The TAG has requesed additional information but has not received it Review type: later review Topic: powerful APIs APIs that reach into your life. Venue: Web Apps WG W3C Web Applications Working Group

Comments

@garykac
Copy link

garykac commented Mar 9, 2019

Góðan TAG !

I'm requesting a TAG review of:

Further details (optional):

You should also know that...

  • The Clipboard API spec is large and describes the old "Event API" and the newer "Async API". For this TAG review, we're interested in feedback on the read() and write() methods of the new Async API (although comments on other parts are welcome).
  • We did a previous TAG review for the Text-only portion of the Async API. That TAG review is here.
  • At this point, we're primarily interested in making sure that the shape of the read/write API looks good so that we can write more detailed descriptions.
  • For the previous TAG review, I called in on the review meeting so that I could answer questions directly. Let me know if you'd like to do that again.

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our Github repo for each point of feedback
  • open a single issue in our Github repo for the entire review
  • leave review feedback as a comment in this issue and @-notify [github usernames]

Please preview the issue and check that the links work before submitting

For background, see our explanation of how to write a good explainer.

@annevk
Copy link
Member

annevk commented Mar 11, 2019

As I realized a little late during the TPAC meeting, I don't think we should move ahead with this without figuring out the drag & drop counterpart. These APIs share a lot of infrastructure and should be co-evolved.

@dway123
Copy link

dway123 commented Mar 12, 2019

We choose to defer decisions around drag-and-drop, even though it does share common infrastructure, so that we can decrease the scope and time of this explainer/review/implementation. That said, this spec'ed feature is compatible with the current web drag-and-drop API, and could easily expand to cover drag-and-drop in a future explainer/spec. I've noted that we have explored this (and other alternatives) in the explainer on the last page. Thank you for catching that we forgot to note the alternatives we had explored.

Are there any specific concerns or risks you're concerned about? I can try to address them either here or in the explainer.

@annevk
Copy link
Member

annevk commented Mar 14, 2019

Another thing here, during the F2F where this was discussed https://www.w3.org/2018/10/23-WebPlat-minutes.html#item02 there were objections from Apple and Mozilla representatives to doing this without standardizing some kind of pickling format. Perhaps the image re-encoding can be standardized somehow, but the explainer doesn't go in much detail and https://w3c.github.io/clipboard-apis/#dom-clipboard-write even less.

It's also not clear to me that it's a good idea to do away with the current API entirely, that's shared between drag & drop and copy & paste. Perhaps if both APIs were redone in a shared model it'd be easier to judge that nothing problematic is happening and there is indeed a natural extension path.

@garykac
Copy link
Author

garykac commented Mar 18, 2019

Re: Drag and Drop

There isn't any real shared infrastructure other than the need for a dictionary of { mimetype -> data }. Beyond that, the two APIs are very different and touch different parts of the UA implementation.

  • Clipboard: access system clipboard + related permission and security issues (re: decoding malicious content) + dict(mimetype->data)

  • DnD: lots of UI events relating to the dragging + permission/security(?) + dict(mimetype->data)

Note that any permission/security issues for DnD (to address malicious content) will be different than for Clipboard (since there is a guaranteed user gesture, but can't easily show a prompt).

Re: DataTransfer

DataTransfer is only used by the Clipboard Event API to get the aforementioned dictionary of { mimetype -> data }. Beyond that it contains a collection of DnD-specific attributes (dropEffect, effectAllowed, dragImage) that are not useful for Clipboard.

Our approach is to have a ClipboardItem type that contains a standard dictionary. This allows the clipboard data to be created more easily (esp. when compared to a series of DataTransferItemList.add() calls).

I've updated the explainer with a section about DataTransfer.

Re: Pickling

The pickling conversation was in the context of making custom mimetypes and raw (non-transcoded) image data available on the native clipboard. We intend to document this format, but that is out-of-scope for this current phase.

Sanitization/transcoding

We want to leave the exact details of the transcoding up to the UA, but I need to make that explicit in the spec. Thanks for pointing that out.

Re: "do away with the current API"

The only clipboard API that actually uses the DataTransfer object is the Event-based API, and no changes are being made to that. The Async Clipboard API has never used DataTransfer (although we did try to make it work), so we're not removing any functional API.

@annevk
Copy link
Member

annevk commented Mar 19, 2019

other than the need for a dictionary

What about data sanitization? That data structure is quite a bit more complex than a map. And sure, there are some differences in the interaction with that data structure, but it's the shared primitive that is being touched here.

We want to leave the exact details of the transcoding up to the UA

I don't think that's acceptable.

@plinss plinss changed the title TAG review: Async Clipboard - image/delayed content Async Clipboard - image/delayed content Mar 26, 2019
@dway123
Copy link

dway123 commented Apr 16, 2019

While data sanitization is a shared need, Clipboard and Drag-and-Drop do have many more unrelated needs, as garykac@ mentioned, such as differing UI events, controllers, and security models.

The specification mentions transcoding as a way to improve security. We don't plan to specify exact algorithms for the following reasons:

Doing so would effectively force us to unnecessarily enclose large parts of the relevant image format specifications.
Having a precise transcoding algorithm may preclude hardware-accelerated implementations.
However, we will clarify that the transcoding process will output image data conformant with the format's specification.

@annevk
Copy link
Member

annevk commented Apr 16, 2019

Such an approach forces other implementers to reverse engineer, right?

@pwnall
Copy link

pwnall commented Apr 16, 2019

@annevk Could you explain what & why other implementers would need to reverse engineer?

I'm going to lay out my assumptions below. Please react to anything that seems incorrect.

The idea behind data sanitization is prevent remote code execution vulnerabilities in native decoders. These vulnerabilities generally take advantage of insufficient input validation in a native decoder, so the sanitization would make sure that the browser only writes data that conforms to the expected input format.

The transcoding (should it be called re-encoding?) idea comes down to taking advantage of the fact that browsers already ship media processing libraries like libpng. So, instead of writing new sanitization code for each format to be written to the clipboard, a browser could use its existing libraries to decode the content produced by the page, then encode it again into the same format. The browser's libraries need to decode arbitrary content from the Web, so the decoders should already be secured against malicious input. Since the encoder ships with the browser, the encoder can be trusted to produce valid bytes.

So, other browsers shouldn't need to reverse-engineer anything. Chrome's transcoding / re-encoding logic is (modulo thread-hops):

const inputBytes = await ReadBytesFromBlob(inputBlob);
const decodedImage = PngDecode(inputBytes);
const safeBytes = PngEncode(decodedImage);
WriteImageToSystemClipboard(safeBytes);

Chrome currently calls into Skia for encoding/decoding images. The expectation is that other browsers would use the logic above, and have it call into their own media processing stack.

WDYT?

@torgo torgo added this to the 2019-05-08-telcon milestone Apr 17, 2019
@slightlyoff
Copy link
Member

There seem to be 2 separate questions around transcoding in my mind, and neither seem to have been answered to my satisfaction:

  1. Why do we need to do this at all given that the same risks are exposed by file download, which we also allow on arbitrary user activation?
  2. What controls (if any) does the developer have over transcode quality?

The first one seems to be the more urgent to answer. If we don't need to transcode, @annevk's concern doesn't apply (not withstanding its overstatement; we've moved out of the era of closed-source browsers; you don't need to reverse engineer anything; just read some code).

As to the second, this gets to a question of layering. A well-layered web platform feature builds on an ever-shrinking set of primitive operations. We have some image transcoding operations in <canvas> today, and the API should either be expressed in terms of them (although, again, these APIs are poorly layered and don't expose near enough developer control over encoding parameters) or, better, an off-thread-friendly, streams-based codec API whose parameters would enable developers to configure the various quality knobs and dials that are essential to make this API more than a toy for users of professional creative apps.

@dway123
Copy link

dway123 commented May 8, 2019

  1. We need transcoding to prevent remote code execution (RCE) vulnerabilities in native decoders from being exercised. File download has slightly different threats to be concerned about. For example, a user will be able to see that the file was downloaded (on the bottom downloads bar), and to see the name of the file before executing it (which is what may lead to these RCE vulnerability being exercised). In contrast, the Async Clipboard API may access data fairly silently after permission is granted, so a user may be caught off guard when pasting data outside Chrome if the data placed on the clipboard was not expected. With transcoding, this data is still guaranteed to be safe, even with user permission.

  2. The developer doesn't have controls over transcode quality, which means this has parity with the existing clipboard API (and many other web platform APIs regarding images). As transcoding doesn't reduce/affect image fidelity, this shouldn't be a concern.

Regarding layering and use with related parts of the web platform, <canvas>, <HTMLElement>, streams, DOMString, and many other types were recommended as potential ClipboardItemDataTypes. Blobs are simply just the first implemented ClipboardItem type, as they’re able to represent a large variety of different Clipboard Types, and browsers are free to implement other representations as resources allow.

Additionally, as image/png (through blob) is the first type being added, this means that, as shown in the web platform tests, the API, when manipulating image/png, can interact with other APIs supporting image/png, such as ImageBitmap. However, this API isn’t dependent on ImageBitmap, so that browsers don’t need to implement ImageBitmap (or similarly, Streams, which also only has partial support across browsers), as a prerequisite for implementing this API.

@dway123
Copy link

dway123 commented May 8, 2019

Also, to clarify, this spec/implementation should not preclude future implementations from using <canvas> or ReadableStream directly. The design and inclusion of ClipboardItemDataType is intentional, to make this API fairly flexible, and to allow future implementations to build upon it as necessary. However, we considered these alternative types out of scope of the current spec/implementation.

@cynthia
Copy link
Member

cynthia commented May 8, 2019

The story for transcoding sounds reasonable; what level of preservation should the user expect in terms of color space, metadata, and other non-raster information?

@dway123
Copy link

dway123 commented May 8, 2019

For images, because the encoder/decoder that Chrome uses (Skia), and existing sanitization, can only guarantee the sanitization of raster information, only raster information is preserved. Non-raster information, therefore, is tossed out. This is consistent with the level of preservation for image metadata offered in other Clipboard APIs, and the ImageBitmap API.

However, the specification should not prevent implementers from deciding whether they choose to preserve non-raster information, as long as they can ensure it's secure.

@yoavweiss
Copy link

For images, because the encoder/decoder that Chrome uses (Skia), and existing sanitization, can only guarantee the sanitization of raster information, only raster information is preserved

Just to make sure, the transcoding is then entirely lossless? Does the same apply to lossy formats, such as JPEG?

@garykac
Copy link
Author

garykac commented May 16, 2019

No, the transcoding is not lossless. It cannot be since it needs to (in some cases) cleanup encoding exploits.

For images, because the encoder/decoder that Chrome uses (Skia), and existing sanitization, can only guarantee the sanitization of raster information, only raster information is preserved.

Read this as "the pixels come through (sanitized) but the image metadata is not copied". Note that this is just a description of Chrome's current implementation. UAs can choose preserve whatever metadata from the image that they feel is appropriate.

The main takeaway from the transcoding issue is that users cannot rely on the image being identical to what they put on the clipboard. The UA may need to make changes to it to address security concerns. This applies to both the image data and to the metadata.

Ideally, UAs would preserve as much information as they can in a safe matter.

@dway123
Copy link

dway123 commented May 17, 2019

As garykac mentioned, users cannot rely on transcoding to be lossless.

It happens to be lossless for image/png's raster information, but this is only for Chrome's current implementation, and should not be expected/wouldn't preserve non-raster (metadata) anyways. For lossy formats, it may very well be lossy, or not, depending on UA implementation, but this is up to UA and there shouldn't be any expectation for lossless transcoding.

@cynthia cynthia added Progress: unreviewed Review type: later review Topic: powerful APIs APIs that reach into your life. Venue: Web Apps WG W3C Web Applications Working Group labels May 23, 2019
@slightlyoff
Copy link
Member

Sorry for the slow reply. Guess we don't have updates from the last meeting.

On the transcoding front, I don't understand how this will work for professional productivity apps. Lossy transcoding is surely going to not fly for, e.g., image editing apps. What is their recourse? A separate permission requesting lossless access? Custom datatypes? Something else?

@pwnall
Copy link

pwnall commented May 30, 2019

Thank you for your feedback! We will be shipping image support for the async clipboard API in Chrome 76, and we are confident that we can address any feedback on transcoding without changing the API shape.

We are working on a different proposal targeting professional editing applications. We expect they will need complete raw clipboard access, which includes reading/writing OS-specific data types, as well as bypassing any transcoding. We're in the very early stages of that work, and will engage with TAG once we have a better idea of what we'd like to cover.

@cynthia
Copy link
Member

cynthia commented May 30, 2019

UAs can choose preserve whatever metadata from the image that they feel is appropriate.

I think content authors would some level of consistency across implementations, leaving it completely open-ended has the risk of promoting preference of specific implementation over another - which is something we should try to avoid. (This browser does something vastly different from what we need for our tool to do what it is supposed to do, so let’s sniff and display a “please use a different UA” dialog is a mistake the web has made in the past.)

@annevk
Copy link
Member

annevk commented Jun 12, 2019

(not withstanding its overstatement; we've moved out of the era of closed-source browsers; you don't need to reverse engineer anything; just read some code)

Please remind me why we're here again?

@dway123 dway123 mentioned this issue Aug 13, 2019
5 tasks
@alice alice removed this from the 2019-12-03-f2f-cupertino milestone Jan 27, 2020
@alice alice added Progress: propose closing we think it should be closed but are waiting on some feedback or consensus Resolution: timed out The TAG has requesed additional information but has not received it and removed Progress: unreviewed Progress: propose closing we think it should be closed but are waiting on some feedback or consensus labels Mar 5, 2020
@alice
Copy link

alice commented Mar 5, 2020

Since this has shipped, and we have no major outstanding concerns, happy to close this.

Thanks for the discussion!

@alice alice closed this as completed Mar 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: timed out The TAG has requesed additional information but has not received it Review type: later review Topic: powerful APIs APIs that reach into your life. Venue: Web Apps WG W3C Web Applications Working Group
Projects
None yet
Development

No branches or pull requests

10 participants