Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CompressStream #410

Closed
3 of 5 tasks
CanonMukai opened this issue Aug 28, 2019 · 17 comments
Closed
3 of 5 tasks

CompressStream #410

CanonMukai opened this issue Aug 28, 2019 · 17 comments
Assignees
Labels
Provenance: Fugu Resolution: satisfied The TAG is satisfied with this design

Comments

@CanonMukai
Copy link

CanonMukai commented Aug 28, 2019

こんにちはTAG!

I'm requesting a TAG review of:

Further details:

We recommend the explainer to be in Markdown. On top of the usual information expected in the explainer, it is strongly recommended to add:

  • Links to major pieces of multi-stakeholder review or discussion of this specification:
  • Links to major unresolved issues or opposition with this specification:

You should also know that...

This proposal is still in the early stages.

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our GitHub repo for each point of feedback
  • open a single issue in our GitHub repo for the entire review
  • leave review feedback as a comment in this issue and @-notify [github usernames]

Please preview the issue and check that the links work before submitting. In particular, if anything links to a URL which requires authentication (e.g. Google document), please make sure anyone with the link can access the document.

¹ For background, see our explanation of how to write a good explainer.

@annevk
Copy link
Member

annevk commented Aug 28, 2019

Is there a single gzip algorithm everyone can and is willing to converge on?

@CanonMukai
Copy link
Author

As long as the compatibility (e.g. between Chromium and Firefox, or old versions and new versions of Chromium) is kept, I think it is better if CompressStream permits other gzip algorithms.

@annevk
Copy link
Member

annevk commented Aug 29, 2019

I haven't researched gzip much, but would a single implementation ever have more than a single algorithm? If the algorithm can change, how do we ensure other participants in this ecosystem remain compatible? Where are the algorithms defined?

@CanonMukai
Copy link
Author

Yes, zlib uses different algorithms depending on the compression level. Gzip is standardized and implementations extensively test against each other, so participants in this ecosystem will remain compatible. The algorithms are defined in RFC 1952 https://tools.ietf.org/html/rfc1952.

@plinss
Copy link
Member

plinss commented Sep 11, 2019

Overall this looks good to us. A few relatively minor points:

  1. Have you considered using different classes for each compression algorithm rather than a constructor argument? e.g. let s = new GzipCompressionStream(); this might make it easier to pollyfill new algorithms.

  2. If the compression algorithm remains an argument, please ensure that the value uses (and references in the spec) the HTTP Content Coding tokens: https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding

  3. It might be useful to have a property to access the compression algorithm of the stream (and any future options).

  4. It may be worth considering adding support for Brotli in V1.

@plinss plinss added Progress: propose closing we think it should be closed but are waiting on some feedback or consensus and removed Progress: unreviewed labels Sep 11, 2019
@kenchris
Copy link

There are cases where you might want to compress data but you don't have a stream. For instance in the Web NFC API people are usually dealing with small amounts of data (ArrayBuffers for instance) and it would be useful to be able to compress those in an easy manner instead of trying to turn things into a stream

@plinss
Copy link
Member

plinss commented Sep 11, 2019

There is an example of some boilerplate code to compress an ArrayBuffer into a Uint8Array, I agree having that (probably as a static method on the class) would be convenient and prevent a lot of repetition.

@cynthia
Copy link
Member

cynthia commented Sep 11, 2019

If there are more than just compression cases (e.g. I can imagine stuff like streamed crypto) where you want to throw in non-stream data, feels like that mechanism could/should be provided as a infrastructure API.

(I agree the usecase is valid and makes sense.)

@dbaron dbaron mentioned this issue Sep 12, 2019
5 tasks
@ricea
Copy link

ricea commented Oct 15, 2019

  1. Have you considered using different classes for each compression algorithm rather than a constructor argument? e.g. let s = new GzipCompressionStream(); this might make it easier to pollyfill new algorithms.

I discussed this with @CanonMukai but we preferred to use a single constructor. At least in Blink, new symbols on the global object are moderately expensive, and I wanted to avoid adding them unnecessarily.

My current thinking for polyfilling / user extensibility is to have a dictionary CompressionStream.algorithms where the keys are algorithm names and the values are factory functions. This also makes inspecting available algorithms easy.

  1. If the compression algorithm remains an argument, please ensure that the value uses (and references in the spec) the HTTP Content Coding tokens: https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding

I hadn't considered this. My one reservation is that using "br" for Brotli seems a little obscure.

Filed as whatwg/compression#7.

  1. It might be useful to have a property to access the compression algorithm of the stream (and any future options).

Yes. One thing that needs more examination is that different algorithms will have different options. For example, gzip will eventually have a filename option. It's not immediately clear to me how to handle this.

Filed as whatwg/compression#6.

  1. It may be worth considering adding support for Brotli in V1.

The awkward thing about Brotli is that to support compression we need to add a 140KB dictionary to the binary. This is a difficult trade-off and I don't have good answer for it yet. Supporting decompression is nearly free, but does it make any sense to only support decompression?

@domenic
Copy link
Member

domenic commented Oct 15, 2019

For example, gzip will eventually have a filename option. It's not immediately clear to me how to handle this.

There is a small precedent here with canvas.getContext(), whose second argument is an optional options object. This is handled on the spec level by declaring it optional any and then switching on the first argument before manually invoking the Web IDL "convert arg to Y" algorithm, where Y is the specific dictionary type in question (e.g. CanvasRenderingContext2DSettings vs. ImageBitmapRenderingContextSettings. I think that's a reasonable pattern, both for spec-writing and for web developer usage.

@annevk
Copy link
Member

annevk commented Oct 15, 2019

(If we keep using that pattern we should try to abstract it into IDL though.)

@ricea
Copy link

ricea commented Oct 15, 2019

There is an example of some boilerplate code to compress an ArrayBuffer into a Uint8Array, I agree having that (probably as a static method on the class) would be convenient and prevent a lot of repetition.

I've filed whatwg/compression#8 for this.

If there are more than just compression cases (e.g. I can imagine stuff like streamed crypto) where you want to throw in non-stream data, feels like that mechanism could/should be provided as a infrastructure API.

I filed issues against the streams standard to discuss these APIs: whatwg/streams#1018 and whatwg/streams#1019. If we implemented these, the boilerplate would reduce to

const output = await ReadableStream.from(input).pipeThrough(new CompressionStream('gzip')).arrayBuffer();

Although there's no reason why we can't do both.

@ricea
Copy link

ricea commented Oct 15, 2019

A couple of updates to the metadata:

Specification URL: https://ricea.github.io/compression/ (final location TBD)
Tests: https://github.com/web-platform-tests/wpt/tree/master/compression

@cynthia
Copy link
Member

cynthia commented Oct 15, 2019

@ricea should I update the initial post's metadata to above?

@ricea
Copy link

ricea commented Oct 15, 2019

@ricea should I update the initial post's metadata to above?

Yes, thank you.

@lknik
Copy link
Member

lknik commented Oct 21, 2019

Hi,

Looked at the security & privacy questionnaire. Thanks for listing CRIME. Otherwise for the time being looks OK to me.

@torgo torgo added this to the 2019-12-16-week milestone Dec 13, 2019
@ylafon
Copy link
Member

ylafon commented Dec 17, 2019

  1. If the compression algorithm remains an argument, please ensure that the value uses (and references in the spec) the HTTP Content Coding tokens: https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding

I hadn't considered this. My one reservation is that using "br" for Brotli seems a little obscure.

A little bit obscure but still well known, so it is always better to avoid defining another such registry.

@ylafon ylafon closed this as completed Dec 17, 2019
@ylafon ylafon removed the Progress: propose closing we think it should be closed but are waiting on some feedback or consensus label Dec 17, 2019
@ylafon ylafon added the Resolution: satisfied The TAG is satisfied with this design label Dec 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Provenance: Fugu Resolution: satisfied The TAG is satisfied with this design
Projects
None yet
Development

No branches or pull requests