Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize pattern for exposing advanced configuration for compression streams #42820

Closed
terrajobst opened this issue Sep 28, 2020 · 50 comments · Fixed by #105430
Closed

Standardize pattern for exposing advanced configuration for compression streams #42820

terrajobst opened this issue Sep 28, 2020 · 50 comments · Fixed by #105430
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.IO.Compression
Milestone

Comments

@terrajobst
Copy link
Member

terrajobst commented Sep 28, 2020

There is a desire to expose compression-algorithm specific configuration when constructing compression streams. The same options can be used for ZLibStream, DeflateStream, GZipStream. Named those options with Zlib prefix same as the native library name they use another prefix we considered is Deflate - the algorithm name, the prefix needed because we have existing CompressionLevel enum that used for all compression streams including BrotliStream and it cannot be used for fine tuning the compression level for specific stream

public enum CompressionLevel
{
Optimal = 0,
Fastest = 1,
NoCompression = 2,
SmallestSize = 3,
}

API Proposal

namespace System.IO.Compression
{
    public enum ZlibCompressionLevel
    {
        DefaultCompression = -1,
        NoCompression = 0,
        Level1 = 1,
        BestSpeed = Level1,
        Level2 = 2,
        Level3 = 3,
        Level4 = 4,
        Level5 = 5,
        Level6 = 6,
        Level7 = 7,
        Level8 = 8,
        Level9 = 9,
        BestCompression = Level9,
    }
    public enum ZlibCompressionStrategy
    {
        DefaultStrategy = 0,
        Filtered = 1,
        HuffmanOnly = 2,
        Rle = 3,
        Fixed = 4
    }
    public enum ZlibFlushMode
    {
        NoFlush = 0,
        PartialFlush = 1,
        SyncFlush = 2,
        FullFlush = 3,
        Finish = 4,
    }
    public sealed class ZLibStream : Stream
    {
        // CompressionMode.Compress
        public ZLibStream(Stream stream, ZlibCompressionLevel compressionLevel = ZlibCompressionLevel.DefaultCompression,
            ZlibCompressionStrategy strategy = ZlibCompressionStrategy.DefaultStrategy, bool leaveOpen = false);
        public ZlibFlushMode FlushMode { get; set; }
    }
    public partial class DeflateStream : Stream
    {
        public DeflateStream(Stream stream, ZlibCompressionLevel compressionLevel = ZlibCompressionLevel.DefaultCompression,
            ZlibCompressionStrategy strategy = ZlibCompressionStrategy.DefaultStrategy, bool leaveOpen = false);
        public ZlibFlushMode FlushMode { get; set; }
     }
     public partial class GZipStream : Stream
    {
        public GZipStream(Stream stream, ZlibCompressionLevel compressionLevel = ZlibCompressionLevel.DefaultCompression,
            ZlibCompressionStrategy strategy = ZlibCompressionStrategy.DefaultStrategy, bool leaveOpen = false) { }
        public ZlibFlushMode FlushMode { get; set; }
    }
    public enum BrotliCompressionQuality
    {
        NoCompression,
        Quality1,
        Quality2,
        Quality3,
        Quality4,
        Quality5,
        Quality6,
        Quality7,
        Quality8,
        Quality9,
        Quality10,
        Quality11
    }
    public sealed partial class BrotliStream : System.IO.Stream
    {
        public BrotliStream(Stream stream, BrotliCompressionQuality quality = BrotliCompressionQuality.Quality4, bool leaveOpen = false) { }
    }
}

ZlibCompressionLevel, ZlibCompressionStrategy are only for Compress mode.

Currently ZlibFlushMode.NoFlush used in normal Read/Write operations, ZlibFlushMode.SyncFlush used for Stream.Flush and ZlibFlushMode.Finish is used on dispose. The value set by the new FlushMode property will be used for normal Read/Write operations only.

The MemoryLevel and/or WindowBits options are omitted because there is no ask for them, we could add these and other options if/when they are requested.

API Usage

private MemoryStream CompressStream(Stream uncompressedStream)
{
    var compressorOutput = new MemoryStream();
    using (var compressionStream = new ZLibStream(compressorOutput, compressionLevel: ZlibCompressionLevel.Level5, strategy: ZlibCompressionStrategy.Filtered, leaveOpen: true))
    {
        compressionStream.FlushMode = ZlibFlushMode.NoFlush;
        var buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = uncompressedStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            compressionStream.Write(buffer, 0, bytesRead);
        }
    }

    compressorOutput.Position = 0;
    return compressorOutput;
}

Alternative Designs

An alternative design would introduce an options type, maybe per each stream, like so:

namespace System.IO.Compression
{
    public class BrotliOptions
    {
        public readonly BrotliCompressionQualityQuality { get; }
        public readonly int WindowBits { get; }
        public BrotliOptions(BrotliCompressionQuality quality = BrotliCompressionQuality.Level4, int windowBits = 22) { }
    }
    public sealed class BrotliStream : Stream
    {
        public BrotliStream(Stream stream, BrotliOptions options, bool leaveOpen = false);
    }
    public struct ZLibOptions
    {
        public int WindowBits { get; }
        public ZlibMemoryLevel MemoryLevel { get; }
        public int CompressionLevel { get; }
        public int CompressionStrategy { get; }
        public ZlibOptions(ZlibCompressionLevel compressionLevel = ZlibCompressionLevel.DefaultCompression, 
             ZlibCompressionStrategy strategy = ZlibCompressionStrategy.DefaultStrategy,
             ZlibMemoryLevel memoryLevel = ZlibMemoryLevel.Level8, int windowBits = -1);
    }
    public sealed class ZLibStream : Stream
    {
        public ZLibStream(Stream stream, ZLibOptions options, bool leaveOpen = false);
    }
    public class DeflateStream : Stream
    {
        public DeflateStream(Stream stream, ZLibOptions options, bool leaveOpen = false);
    }
    public class GZipStream : Stream
    {
        public GZipStream(Stream stream, ZLibOptionsoptions, bool leaveOpen = false);
    }
}

We can then decide whether we want to be in the business of defining enums for the underlying types or whether we consider them passthru.

@terrajobst terrajobst added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO.Compression labels Sep 28, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Sep 28, 2020
@stephentoub stephentoub removed the untriaged New issue has not been triaged by the area owner label Sep 29, 2020
@stephentoub stephentoub added this to the 6.0.0 milestone Sep 29, 2020
@FiniteReality
Copy link

I personally like the Options approach; it's easier to pass around as part of configuration options, and IMO makes refactoring (e.g. to change the compression algorithm being used) easier if that becomes an issue.

@JimBobSquarePants
Copy link

The Options approach doesn't allow for setting the ZlibFlushMode property for each read/write. That was a requirement identified in the previous thread.

@FiniteReality
Copy link

The Options approach doesn't allow for setting the ZlibFlushMode property for each read/write. That was a requirement identified in the previous thread.

This isn't necessarily true:

class ZLibOptions
{
    public static readonly int FlushFinish = 4;
    public int FlushMode { get; set; }
}

class ZLibStream : Stream
{
    public ZLibStream(Stream stream, ZLibOptions options, bool leaveOpen = false);
    public ZLibOptions Options { get; set; }
}
// ...
using var stream = new ZLibStream(myStream, options);
stream.Options.FlushMode = ZLibOptions.FlushFinish;
stream.Write(...);

However, couldn't the existing Flush API be augmented to allow you to specify the flush mode? I imagine, by default, you'd want Z_NO_FLUSH and then would want to specify something special in a separate step. I'm not sure if ZLib supports that sort of behaviour though

@JimBobSquarePants
Copy link

Ah yeah, no. Deflate.Flush is a separate operation from Stream.Flush.

@FyiurAmron
Copy link
Contributor

FyiurAmron commented Dec 4, 2020

FWIW, I'd say that current abstraction is broken (both in terms of correspondence with e.g. actual zlib functions, of POLA, and in simplicity of use) and adding a better abstraction would be a better idea. #2236 (comment) sums up my stance on this. As such, I'd say that providing an access to more low-level API would solve all the issues raised here (and in #2236, #16923, and probably some other places too).

What am I talking about precisely? Look at actual zlib. We of course can skip both the init/end as separate calls, but still: both deflate() and inflate() calls take z_stream_s & flush as param, which can be roughly translated to 3 OOP params: input stream, output stream, and call options. While some options are arguably useless in C# (alloc fn, dealloc fn), the architecture would be still the same. zlib provides two explicit methods for this. Using a single class for two different (albeit related) functionalities is akin to having a serializer being able to deserialize etc. In this case, the semantics are outright ugly:

DeflateStream compressStream = new( srcStream, CompressionLevel.Optimal );
DeflateStream decompressStream = new( srcStream, CompressionMode.Decompress );

ZlibStream compressStream = new( srcStream, CompressionLevel.Optimal );
ZlibStream decompressStream = new( srcStream, CompressionMode.Decompress );

This is neither intuitive or useful in any way. A proper way would be to have either (ZlibOptions would extend generic Options common for all (de)compressors):

ZlibDeflateStream compressStream = new( srcStream, zlibOptions );
ZlibInflateStream decompressStream = new( srcStream, zlibOptions );

or, if for some reason we want to have them bundled into a single class (I would advocate against it, but it would be still better than current approach):

ZlibStream compressStream = new( srcStream, Mode.Compress, zlibOptions );
ZlibStream decompressStream = new( srcStream, Mode.Decompress, zlibOptions );

Using an option pattern for flushing would be IMVHO the best option (sic!), since we can just set flush option as read-write (as @FiniteReality stated). I would keep the other options, especially those used in init or related to it (mode [compress/decompress], compression level, compression strategy, data type etc.) read-only. This way, we can have the best of all - access to low-level operations, easy backwards-compatible extensibility, clear abstractions. Switching the used headers etc. would be trivial this way.

e.g. https://github.com/Elskom/zlib.managed/tree/master/zlib.managed does this in the most straightforward way.

tl;dr please, for the sake of humanity and future generations, use options instead of wacky constructors.

@rog1039
Copy link

rog1039 commented Jan 6, 2021

I've landed here because the current API for managing Brotli compression only gives me:

public enum CompressionLevel
{
        Optimal = 0,
        Fastest = 1,
        NoCompression = 2
}

In one sample, optimal compressed to 700K in 53sec and fastest to 4,400K in 2.8sec. I'd like to find a middle ground between these two extremes for our application.

I'm not too hung up on how we end up tweaking the various knobs but do think it is important to have access to all (or near all) the knobs for a given compression algorithm.

From a cursory reading of this and related threads, I get the feeling that a dotnet-wide compression enum is an abstraction that is trying to cover too many use cases. The options style approach seems fine to me.

UPDATE
I just went through a blog post that suggested I do

Level = (CompressionLevel)5;

as a way that may affect the compression level. This did work but it was not clear how or why this would do what I wanted. It was only after going through dotnet source code that it made any sense. First to the BrotliStream:
image

and then into
image

So, in summary, it would have saved a lot of time if we simply exposed algorithm specific options to the programmer. The current behavior is unintuitive and limiting. A new options-based configuration API sounds expressive and intuitive.

@terrajobst
Copy link
Member Author

terrajobst commented May 11, 2021

@carlossanlop any objections marking this as api-ready-for-review? We can discuss this in the meeting if we want to do this at all.

Never mind; I just realized that I don't have a design proposal, just a vague direction to go after 😄

@AraHaan
Copy link
Member

AraHaan commented Oct 10, 2021

I think the best option is to overhaul System.IO.Compression and deprecate the current classes and enums inside for better designed compression types, or at least change them but to carefully do it to not break someome badly. This is why I think the best option is to deprecate them and start over on them. I will see what I can do on my end to fix a regression in my zlib that I did not have tests to catch and then propose them to replace the current zlib apis for .NET 7, I might also at the same time work on the other types in System.IO.Compression as well and work them all in a single proposal that could go into this issue. However it might take a few months to work on.

@rog1039
Copy link

rog1039 commented Oct 14, 2021

I think a new Api could be worthwhile. Just a few minutes ago I ran into another problem with the current API design. I upgraded a WebApi project to .Net 6 and some of my endpoint response times dropped from 200ms to 3seconds. I was very confused why my response time was 15x longer than before. Turns out it was due to the design of the compression API's.

I used

services.AddResponseCompression(options =>
            {
                options.MimeTypes = CompressedMimeTypes;
                options.Providers.Add(
                    new BrotliCompressionProvider(new BrotliCompressionProviderOptions()
                    {
                        Level = (CompressionLevel) 3
                    }));
            });

in Net5 to get a brotli compression level of 3 which provided a good combination of compression size vs compression time.

However, in Net6, the CompressionLevel enum was expanded to include SmallestSize which is now enum value 3:

CompressionLevel
  {
    Optimal,
    Fastest,
    NoCompression,
    SmallestSize,
  }

Now, my code instead of using Brotli compression level 3 was using level 11!

Why was that? Because of this code:

 internal static int GetQualityFromCompressionLevel(CompressionLevel level) =>
            level switch
            {
                CompressionLevel.Optimal => Quality_Default,
                CompressionLevel.NoCompression => Quality_Min,
                CompressionLevel.Fastest => 1,
                CompressionLevel.SmallestSize => Quality_Max,
                _ => (int)level,
            };

The CompressionLevel enum has 4 members now so Brotli compression levels 1, 2, and 3 are not even usable because they map to enum members. Fortunately, brotli compression level 4 has compression and runtimes similar to level 3 so it did not matter in our case much.

The more I think about this issue, the more I lean towards a new API for compression. So another +1 for a new options-based or some other new API where compression API's are explicit and understandable rather than the confusing situation we have today.

@rog1039
Copy link

rog1039 commented Oct 14, 2021

Further, the choice of enum conversions in BrotliUtils.cs is strange. Optimal compression and SmallestSize both map to the highest compression, value 11.

Smallest size makes sense as 11, however, Optimal should not IMHO map to 11. In my application, on a roughly 1.5MB payload, compression level 3 reduces the payload by 95% and takes 200ms. Compression level 11 reduces the payload by 97% and takes 3,000ms. Doesn't seem like it is worth taking 15 times as long for an additional 2% reduction in size. Optimal in BrotliUtils.cs should map to something else like level 3. Better still of course is a new API, but this seems like a quick 1-line change that would make Brotli easier to use by avoiding the probably never used compression level 11.

    internal static partial class BrotliUtils
    {
        public const int WindowBits_Min = 10;
        public const int WindowBits_Default = 22;
        public const int WindowBits_Max = 24;
        public const int Quality_Min = 0;
        public const int Quality_Default = 11; //In my opinion, Quality_Default should be 3 or something similar, not 11.
        public const int Quality_Max = 11;
        public const int MaxInputSize = int.MaxValue - 515; // 515 is the max compressed extra bytes

        internal static int GetQualityFromCompressionLevel(CompressionLevel level) =>
            level switch
            {
                CompressionLevel.Optimal => Quality_Default,
                CompressionLevel.NoCompression => Quality_Min,
                CompressionLevel.Fastest => 1,
                CompressionLevel.SmallestSize => Quality_Max,
                _ => (int)level,
            };
    }

Also, just looked at the source for CompressionLevel:

/// <summary>
/// The compression operation should balance compression speed and output size.
/// </summary>
Optimal = 0,

So Quality_Default = 11 seems to be the exact opposite of the purpose of the Optimal member in the enum.

@AraHaan
Copy link
Member

AraHaan commented Oct 14, 2021

I agree I think either an entire redesign of the apis is needed or an public ctor overload accepting int's so we can bypass the range check (and the badly designed enum that is reused for all of the compression stream classes). In zlib if I wanted to use compression level 9 by casting the 9 to the enum with DeflateStream/ZlibStream (.NET 6) then it will throw an out of range exception instead of just working out of the box and using the native (and supported) compression level 9, but that is an example with 9, some people might want the 8, 7, 5, 4, 3, 2 compression levels on zlib and not 9 that is not covered by the enum.

Alternatively I would say that the enum should be obsoleted and the ctors accepting them and recommending them to move to the ones accepting ints and have their own range checking as well between the minimal compression levels, and the maximum one and if not in between them throw in the ctor. Which would probably be even better to do than needing to maintain an poorly designed enum that is reused and not even enough for what most developer's needs are. We have to consider these:

  • use ints for compression level input and range check
  • add x new enums for each compression stream that would contain all of the compression levels (public) of what the native implementation supports.
  • allow to configure the flush strategy on the streams (expose an FlushStategy property for supported streams like zlib)

I think that bullet 1 and 3 are the best options, it will expose new properties and overloads of the ctors accepting int's instead of polluting the namespace with x more enums that people then have to look up the documentations for. Which would make it a lot easier to migrate to the ctors of the streams accepting an int compression level.

@AraHaan
Copy link
Member

AraHaan commented Nov 28, 2021

API Proposal

namespace System.IO.Compression
{
+    public enum ZlibCompressionLevel
+    {
+        DefaultCompression = -1,
+        NoCompression,
+        Level1,
+        BestSpeed = Level1,
+        Level2,
+        Level3,
+        Level4,
+        Level5,
+        Level6,
+        Level7,
+        Level8,
+        Level9,
+        BestCompression = Level9
+    }
+    public enum ZlibCompressionStrategy
+    {
+        DefaultStrategy,
+        Filtered,
+        HuffmanOnly,
+        Rle,
+        Fixed
+    }
+    public enum ZlibFlushCode
+    {
+        NoFlush,
+        PartialFlush,
+        SyncFlush,
+        FullFlush,
+        Finish,
+        Block,
+        Trees
+    }
+    public enum ZlibMemoryLevel
+    {
+        Level1 = 1,
+        Level2,
+        Level3,
+        Level4,
+        Level5,
+        Level6,
+        Level7,
+        Level8,
+        Level9
+    }
+    public class ZlibOptions
+    {
+        public int WindowBits { get { throw null; } }
+        public ZlibMemoryLevel MemoryLevel { get { throw null; } }
+        public ZlibFlushCode FlushMode { get { throw null; } set { throw null; } }
+        public ZlibCompressionLevel CompressionLevel { get { throw null; } }
+        public ZlibCompressionStrategy Strategy { get { throw null; } }
+        public CompressionMode CompressionMode { get { throw null; } }
+        public ZlibOptions() { } // Do not override this in subclasses (hopefully those compiler generated ones will be generated to call this base one).
+        public ZlibOptions(CompressionMode compressionMode) { } // override this one in subclasses.
+        public ZlibOptions(int windowBits, ZlibMemoryLevel memoryLevel, ZlibCompressionStrategy strategy, CompressionMode compressionMode) { }
+        public ZlibOptions(ZlibCompressionLevel compressionLevel) { }
+    }
+    public class DeflateOptions : ZlibOptions
+    {
+        public DeflateOptions(CompressionMode compressionMode) { }
+    }
+    public class GZipOptions : ZlibOptions
+    {
+        public GZipOptions(CompressionMode compressionMode) { }
+    }
+    public enum BrotliCompressionLevel
+    {
+        NoCompression,
+        Level1,
+        Level2,
+        Level3,
+        Level4,
+        Level5,
+        Level6,
+        Level7,
+        Level8,
+        Level9,
+        Level10,
+        Level11
+    }
+    public struct BrotliOptions
+    {
+        public readonly BrotliCompressionLevel CompressionLevel { get { throw null; } }
+        public readonly int WindowBits { get { throw null; } }
+        public readonly CompressionMode CompressionMode { get { throw null; } }
+        public BrotliOptions() { }
+        public BrotliOptions(BrotliCompressionLevel compressionLevel) { }
+        public BrotliOptions(BrotliCompressionLevel compressionLevel, int windowBits) { }
+    }
    public partial struct BrotliEncoder : System.IDisposable
    {
+        public BrotliEncoder(BrotliOptions options) { throw null; }
+        public static bool TryCompress(System.ReadOnlySpan<byte> source, System.Span<byte> destination, out int bytesWritten, BrotliOptions options) { throw null; }
    }
    public sealed partial class BrotliStream : System.IO.Stream
    {
+        public BrotliStream(Stream stream, BrotliOptions options) { }
+        public BrotliStream(Stream stream, BrotliOptions options, bool leaveOpen) { }
    }

Usage

For usage see the proposal: #62113.

Risk

Minimal

Tests

Will have to be added to the PR that will fix #62113. Likewise due to the tests needing to be added there the implementation might just have go in that same PR.

Details

This is due to needing the Encoder/Decoder added in that PR to then finish off this issue, to where the options classes can be used in the stream compression api classes (with a public constructor added that can accept an options instance).

Why do subclasses of ZlibOptions?

I feel like the subclass model to doing custom configurations of the windowbits, memory level, and strategy is the way to go (it will allow developers to quickly pick what they want with little effort).

@stephentoub
Copy link
Member

we need to consider if we should mark those as obsolete so people can migrate to span/memory based compression apis that are encoder/decoder like and feel more natural.

No, we will not be obsoleting/deprecating the Stream-based compression APIs.

@buyaa-n
Copy link
Contributor

buyaa-n commented Jul 12, 2024

@JimBobSquarePants @AraHaan and whoever interested in these APIs, do you need the MemoryLevel and/or windowBits options exposed for ZLib? Also, is anyone needs the windowBits option exposed for Brotli? I looked around a bit here and other related issues/PRs don't see anyone mentioned that they need these options, so I would like to remove those options from the proposal.

@JimBobSquarePants
Copy link

JimBobSquarePants commented Jul 12, 2024

@buyaa-n I don’t think I would ever have a need for them and I don’t think many people would. Compression levels and strategy are commonly used features.

@masonwheeler
Copy link
Contributor

Shouldn't this breaking change be reversed?

No. It was a bug it was allowed in the first place.

"Bug" or not, it's still a breaking change because people wrote code that depends on it, that code worked as expected, and it no longer works in the updated version. That's a real problem, one that Microsoft used to understand well.

@stephentoub
Copy link
Member

it's still a breaking change

And was documented as such.
https://learn.microsoft.com/en-us/dotnet/core/compatibility/core-libraries/7.0/brotlistream-ctor

@AraHaan
Copy link
Member

AraHaan commented Jul 15, 2024

@buyaa-n I don’t think I would ever have a need for them and I don’t think many people would. Compression levels and strategy are commonly used features.

While that may be true, there are times where people want to make software using custom window bits so having a public api that supports providing it is a free win for them.

@AraHaan
Copy link
Member

AraHaan commented Jul 15, 2024

And here is an image explaining why to expose window bits at least for zlib:

image

In my ZlibSharp package, I exposed it because it is pretty common for an application to want to compress items using Deflate, zlib-streams, and gzip streams. An example software that comes in mind is https://www.github.com/python/cpython/ because their built-in standard library has a hard requirement to support all 3 types. Also my ZlibSharp package I have finally decided to apply my zlib encoder and zlib decoder logic to it to give people a preview of that and the type of ABI surface I want for when I introduce it as System.IO.Compression.Zlib which would be a better designed implementation.

@stephentoub
Copy link
Member

it is pretty common for an application to want to compress items using Deflate, zlib-streams, and gzip streams

For which DeflateStream, ZLibStream, and GZipStream exist.

@AraHaan
Copy link
Member

AraHaan commented Jul 15, 2024

it is pretty common for an application to want to compress items using Deflate, zlib-streams, and gzip streams

For which DeflateStream, ZLibStream, and GZipStream exist.

I sorta made something that can replace those streams via ZlibEncoder and ZlibDecoder and will arrive in my ZlipSharp preview package that can be tested as a proof of concept for a new System.IO.Compression.Zlib assembly in the runtime.

Version 1.2.13.4 though as I have yet to update the native zlip for it to 1.3.1.
Edit: Version is now live on nuget to test out.

Things to note: For the upcoming .NET version the code could use a new Error member (for generic errors) in System.Buffers.OperationStatus as it does not apply to zlib's Z_VERSION_ERROR and Z_ERRNO error codes that can be returned from the api's that does not fall into the existing error types in that enum for the System.IO.Compression.Zlib assembly though.

@buyaa-n
Copy link
Contributor

buyaa-n commented Jul 16, 2024

it is pretty common for an application to want to compress items using Deflate, zlib-streams, and gzip streams

For which DeflateStream, ZLibStream, and GZipStream exist.

I sorta made something that can replace those streams via ZlibEncoder and ZlibDecoder and will arrive in my ZlipSharp preview package that can be tested as a proof of concept for a new System.IO.Compression.Zlib assembly in the runtime.

That doesn't sound like a reason to add those for runtime. We can add that later, when it's really needed. I removed them from the API proposal.

@AraHaan
Copy link
Member

AraHaan commented Jul 16, 2024

it is pretty common for an application to want to compress items using Deflate, zlib-streams, and gzip streams

For which DeflateStream, ZLibStream, and GZipStream exist.

I sorta made something that can replace those streams via ZlibEncoder and ZlibDecoder and will arrive in my ZlipSharp preview package that can be tested as a proof of concept for a new System.IO.Compression.Zlib assembly in the runtime.

That doesn't sound like a reason to add those for runtime. We can add that later, when it's really needed. I removed them from the API proposal.

I disagree, having only stream based API's is counter productive if they are in hot path code as it would reduce performance. Span based on the other hand would increase performance of those hot paths and can make a difference in places. There was a discussion about this in a particular discord server about that. Also I have found that in a majority of cases when compressing or decompressing data people tend to not need streams and only have use for them for the existing compression streams to zlib and that POC package avoids them while allowing those people to still decompress/compress their data.

@stephentoub
Copy link
Member

That doesn't sound like a reason to add those for runtime. We can add that later, when it's really needed. I removed them from the API proposal.

I disagree, having only stream based API's is counter productive

The two of you are talking about different things. @buyaa-n is taking about Stream ctors that take the additional settings. @AraHaan is talking about non-Stream-based encoder/decoders. This issue is about Stream ctors.

@buyaa-n
Copy link
Contributor

buyaa-n commented Jul 16, 2024

That doesn't sound like a reason to add those for runtime. We can add that later, when it's really needed. I removed them from the API proposal.

I disagree, having only stream based API's is counter productive

The two of you are talking about different things. @buyaa-n is taking about Stream ctors that take the additional settings. @AraHaan is talking about non-Stream-based encoder/decoders. This issue is about Stream ctors.

Right, sorry for the confusion, I am talking about adding windowBits option for DeflateStream, ZLibStream, and GZipStream (and Brotli). Don't see its needed for these streams now, and we can add it later when needed.

@bartonjs
Copy link
Member

bartonjs commented Jul 23, 2024

Video

  • ZlibCompressionStrategy => ZLibCompressionStrategy (consistent casing with ZLibStream)
  • ZlibCompressionStrategy.Rle => RunLengthEncoding
  • ZlibCompressionStrategy.DefaultStrategy => Default
  • ZLibFlushMode is cut pending a scenario
  • ZLibCompressionLevel is cut, just using the raw integer.
  • Created ZLibCompressionOptions to allow for expanding to more options in the future
  • ZLibCompressionOptions is a required parameter, consistent with the existing overloads.
  • BrotliCompressionQuality => int property on a new class
namespace System.IO.Compression
{
    public enum ZLibCompressionStrategy
    {
        Default = 0,
        Filtered = 1,
        HuffmanOnly = 2,
        RunLengthEncoding = 3,
        Fixed = 4,
    }
    public sealed class ZLibCompressionOptions
    {
        public int CompressionLevel { get; set; }
        public ZLibCompressionStrategy CompressionStrategy { get; set; }
    }
    public sealed class ZLibStream : Stream
    {
        // CompressionMode.Compress
        public ZLibStream(Stream stream, ZLibCompressionOptions compressionOptions, bool leaveOpen = false);
    }
    public partial class DeflateStream : Stream
    {
        public DeflateStream(Stream stream, ZLibCompressionOptions compressionOptions, bool leaveOpen = false);
    }
    public partial class GZipStream : Stream
    {
        public GZipStream(Stream stream, ZLibCompressionOptions compressionOptions, bool leaveOpen = false);
    }


    public sealed class BrotliCompressionOptions
    {
        public int Quality { get; set; }
    }
    public sealed partial class BrotliStream : System.IO.Stream
    {
        public BrotliStream(Stream stream, BrotliCompressionOptions compressionOptions, bool leaveOpen = false) { }
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jul 23, 2024
@Mrgaton
Copy link

Mrgaton commented Jul 23, 2024

What about dictionary size or window length

@Mrgaton
Copy link

Mrgaton commented Jul 23, 2024

Compresión level maybe would be better with an enum to know which options are available, same as brotli quality

@Mrgaton
Copy link

Mrgaton commented Jul 23, 2024

I think node js can compress brotli with multiple threads, could this be possible on c# too?

@buyaa-n
Copy link
Contributor

buyaa-n commented Jul 29, 2024

What about dictionary size or window length

If you need them, please open a new issue (API poposal) with your user scenario, we are open to add more options to the new ***CompressionOptions types

@Mrgaton
Copy link

Mrgaton commented Jul 29, 2024

I tried opening them and I couldn't do it If you can do it, please

I would like to be able to increase dictionary length and set custom brotli mode, also deflate doesn't got any custom compression options?

@stephentoub
Copy link
Member

I tried opening them and I couldn't do it If you can do it, please

What prevented you from doing so?

@Mrgaton
Copy link

Mrgaton commented Jul 30, 2024

I didn't have permission or something it gives me an error

@masonwheeler
Copy link
Contributor

If you could copy/paste the exact text of the error here, that would be extremely helpful.

@buyaa-n
Copy link
Contributor

buyaa-n commented Jul 30, 2024

I didn't have permission or something it gives me an error

No permission needed for creating an issue

@Mrgaton
Copy link

Mrgaton commented Jul 30, 2024

Yes I tried and on PC and let me do it, sorry.

@github-actions github-actions bot locked and limited conversation to collaborators Aug 30, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.IO.Compression
Projects
None yet
Development

Successfully merging a pull request may close this issue.