Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in System.IO.Compression.Inflater or the native zlib dependency - invalid distance too far back. #50235

Closed
Tracked by #93172 ...
zlatanov opened this issue Mar 25, 2021 · 20 comments · Fixed by #51857
Closed
Tracked by #93172 ...

Comments

@zlatanov
Copy link
Contributor

zlatanov commented Mar 25, 2021

This is related to the work I am doing to enable compression in WebSockets (here). I have written several tests that make sure the websocket works well with all supported window sizes. To test edge cases I also have created data, which when compressed results in bigger payload.

All tests work fine, except one and I isolated the issue to the following reproducible code:

using System;
using System.Buffers;
using System.IO.Compression;
using System.Reflection;

class Program
{
    static void Main()
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.NonPublic;

        // The data file is 65535 bytes, compressed (raw deflate) to 67022 (bigger).
        var dataStream = typeof(Program).Assembly.GetManifestResourceStream("Data.deflate");
        var constructor = typeof(DeflateStream).GetConstructor(flags, binder: null,
            types: new Type[] { typeof(Stream), typeof(CompressionMode), typeof(bool), typeof(int), typeof(long) }, modifiers: null);
        using var deflate = (DeflateStream)constructor.Invoke(new object[] { 
            dataStream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-10, /*uncompressedSize*/-1L });

        typeof(DeflateStream).GetField("_buffer", flags).SetValue(deflate, ArrayPool<byte>.Shared.Rent(512));

        var buffer = new byte[512];
        var count = 0;

        while (true)
        {
            var byteCount = deflate.Read(buffer);
            count += byteCount;

            if (count == ushort.MaxValue)
            {
                break;
            }
        }
    }
}

Ignore the reflection, in System.Net.WebSockets we have access to the Interop.zlib.cs, but it would not be easy to write code that shows the problem without a lot of boilerplate stuff.

The code in the example will throw an exception, and the native zlib error will be "invalid distance too far back". If we change the size of the _buffer to 2048, the error disappears and the data is inflated without a problem.

The default buffer size for DeflateStream is 8KB and I so far have been unable to create data that would cause the same error, but nevertheless I think there is a bug somewhere.

In the WebSocket right now I have dynamic buffer size, depending on how big the user buffer is. I can easily use 8KB as minimum, but I think this would only hide the problem or make it rarer.

//cc @CarnaViire, @carlossanlop, @stephentoub

ConsoleApp.zip

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Mar 25, 2021
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@stephentoub
Copy link
Member

stephentoub commented Mar 25, 2021

@zlatanov, I tried your repro on both .NET 5 and .NET 6 recent bits, and both ran successfully without exception for me on Windows.

What version are you testing against? On what OS? I can try on Linux as well.

@stephentoub
Copy link
Member

I just tried as well on Ubuntu and got "System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method."

@zlatanov
Copy link
Contributor Author

zlatanov commented Mar 25, 2021

@stephentoub I'm sorry, I attached the working version of the code. On line 19, change the rent'ed buffer size to be 512 or 1024 instead of 2048 please. I will reattach the project.

image

@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Mar 25, 2021
@carlossanlop carlossanlop added this to the Future milestone Mar 25, 2021
@carlossanlop carlossanlop added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Mar 25, 2021
@stephentoub
Copy link
Member

To test edge cases I also have created data, which when compressed results in bigger payload.

How did you create this data file?

@zlatanov
Copy link
Contributor Author

Using Random class and deflating the data in segments. However here is the same code without the particular data file, using Random and only DeflateStream to first compress and then decompress.

using System;
using System.Buffers;
using System.IO;
using System.IO.Compression;
using System.Reflection;

class Program
{
    static void Main()
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.NonPublic;
        var constructor = typeof(DeflateStream).GetConstructor(flags, binder: null,
            types: new Type[] { typeof(Stream), typeof(CompressionMode), typeof(bool), typeof(int), typeof(long) }, modifiers: null);

        var stream = new MemoryStream();

        using (var inflator = (DeflateStream)constructor.Invoke(new object[] {
            stream, CompressionMode.Compress, /*leaveOpen*/true, /*windowBits*/-10, /*uncompressedSize*/-1L }))
        {
            var data = new byte[ushort.MaxValue];
            new Random(0).NextBytes(data);

            inflator.Write(data);
        }

        stream.Position = 0;

        using (var deflator = (DeflateStream)constructor.Invoke(new object[] {
            stream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-10, /*uncompressedSize*/-1L }))
        {
            typeof(DeflateStream).GetField("_buffer", flags).SetValue(deflator, ArrayPool<byte>.Shared.Rent(512));

            var buffer = new byte[512];
            var count = 0;

            while (true)
            {
                var byteCount = deflator.Read(buffer);
                count += byteCount;

                if (count == ushort.MaxValue)
                {
                    break;
                }
            }
        }
    }
}

@CarnaViire
Copy link
Member

@carlossanlop @adamsitnik @jozkee
This is now blocking on large community PR (#49304) and we would love to help... Could you please take a look at this?

The same error also appeared several times during WebSocket compression testing with https://github.com/crossbario/autobahn-testsuite
@zlatanov could you please provide more information on the new occurrences?

@zlatanov
Copy link
Contributor Author

I have created https://github.com/zlatanov/websockets-compliance repository with instructions how to run the testsuite. To demonstrate the issue we can look only at one of the cases which is pretty simple:

Case 13.3.1 (id: 428)
Send 1000 compressed messages each of payload size 16, auto-fragment to 0 octets. Use permessage-deflate client offers (requestNoContextTakeover, requestMaxWindowBits): [(False, 9)]

Running the project in the aforementioned repository as follows: CoreRun.exe "path\to\bin\release\net5.0\WebSocketCompliance.dll" 428 will result in the following error in the autobahn code:
image

Not all cases have this problem, only some of them, and only those which use 9 window bits.

I cannot confidently say that this isn't an issue with the way I use the deflater, it's just that I can't find the reason for it.

@zlatanov
Copy link
Contributor Author

Here is a minimal repro with data captured from the testsuite for test case 428.

using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Reflection;

class Program
{
    static readonly ConstructorInfo Constructor = typeof(DeflateStream).GetConstructor(BindingFlags.Instance | BindingFlags.NonPublic, binder: null,
        types: new Type[] { typeof(Stream), typeof(CompressionMode), typeof(bool), typeof(int), typeof(long) }, modifiers: null);

    static void Main()
    {
        Working();
        Failing();
    }

    static void Working()
    {
        var memoryStream = new MemoryStream(Compress().SelectMany(x => x).ToArray());
        var deflate = (DeflateStream)Constructor.Invoke(new object[] {
                memoryStream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-9, /*uncompressedSize*/-1L });

        var buffer = new byte[1024];

        while (true)
        {
            var bytesRead = deflate.Read(buffer);
            if (bytesRead == 0)
            {
                break;
            }
        }

        Console.WriteLine("Works");
    }

    static void Failing()
    {
        var memoryStream = new MemoryStream();
        var deflate = (DeflateStream)Constructor.Invoke(new object[] {
                memoryStream, CompressionMode.Decompress, /*leaveOpen*/false, /*windowBits*/-9, /*uncompressedSize*/-1L });

        var buffer = new byte[1024];

        foreach (var segment in Compress())
        {
            memoryStream.Write(segment);
            memoryStream.Position -= segment.Length;

            deflate.Read(buffer);
        }

        Console.WriteLine("Should not see this...");
    }

    static List<byte[]> Compress()
    {
        var stream = new MemoryStream();
        var result = new List<byte[]>();

        using var inflator = (DeflateStream)Constructor.Invoke(new object[] {
                stream, CompressionMode.Compress, /*leaveOpen*/true, /*windowBits*/-9, /*uncompressedSize*/-1L });

        var reader = new StringReader(Data);
        var line = reader.ReadLine();

        while (line is not null)
        {
            inflator.Write(Convert.FromHexString(line));
            inflator.Flush();

            result.Add(stream.ToArray());
            stream.SetLength(0);

            line = reader.ReadLine();
        }

        return result;
    }

    const string Data =
@"7B0A202020224175746F6261686E5079
74686F6E2F302E362E30223A207B0A20
202020202022312E312E31223A207B0A
20202020202020202022626568617669
6F72223A20224F4B222C0A2020202020
20202020226265686176696F72436C6F
7365223A20224F4B222C0A2020202020
20202020226475726174696F6E223A20
322C0A2020202020202020202272656D
6F7465436C6F7365436F6465223A2031
3030302C0A2020202020202020202272
65706F727466696C65223A2022617574
6F6261686E707974686F6E5F305F365F
305F636173655F315F315F312E6A736F
6E220A2020202020207D2C0A20202020
202022312E312E32223A207B0A202020
202020202020226265686176696F7222
3A20224F4B222C0A2020202020202020
20226265686176696F72436C6F736522
3A20224F4B222C0A2020202020202020
20226475726174696F6E223A20322C0A
2020202020202020202272656D6F7465
436C6F7365436F6465223A2031303030
2C0A202020202020202020227265706F
727466696C65223A20226175746F6261
686E707974686F6E5F305F365F305F63
6173655F315F315F322E6A736F6E220A
2020202020207D2C0A20202020202022
312E312E33223A207B0A202020202020
202020226265686176696F72223A2022
4F4B222C0A2020202020202020202262
65686176696F72436C6F7365223A2022
4F4B222C0A2020202020202020202264
75726174696F6E223A20322C0A202020
2020202020202272656D6F7465436C6F
7365436F6465223A20313030302C0A20
2020202020202020227265706F727466
696C65223A20226175746F6261686E70
7974686F6E5F305F365F305F63617365
5F315F315F332E6A736F6E220A202020
2020207D2C0A20202020202022312E31
2E34223A207B0A202020202020202020
226265686176696F72223A20224F4B22
2C0A2020202020202020202262656861
76696F72436C6F7365223A20224F4B22
2C0A2020202020202020202264757261
74696F6E223A20322C0A202020202020";
}

The difference between the working and the failing method is that in the first I am reading everything at once. In the second I am reading line by line.

@zlatanov
Copy link
Contributor Author

I can confirm this only happens with the intel version of the zlib library. I've attached a c++ project that replicates the test case above. I've included in the project the source files from src\libraries\Native\Windows\System.IO.Compression.Native.

I've added a project property that enables to build with either one of the versions - <UseIntel>false</UseIntel>. You can change that to true / false and rebuild the project and run. Using the original version of zlib, you should see the following output:
image

However, if we change the project to use the Intel's build, we should see this:

image

I've written the main function of the project so that it's easy to set a breakpoint where things go wrong (see line 29):

image

This should confirm that there is indeed a problem with zlib and not the websockets source code.

ZLibConsoleApp.zip

/cc @CarnaViire, @stephentoub

@stephentoub
Copy link
Member

I can confirm this only happens with the intel version of the zlib library

@ericstj

@zlatanov
Copy link
Contributor Author

@stephentoub After looking around I found this:

if (hash_head != 0 && s->strstart - hash_head <= MAX_DIST2) {

if (hash_head != 0 && s->strstart - hash_head <= MAX_DIST2) {

If we replace the usage of the MAX_DIST2 constant (#define MAX_DIST2 ((1 << MAX_WBITS) - MIN_LOOKAHEAD)) and use MAX_DIST(s) instead (#define MAX_DIST(s) ((s)->w_size-MIN_LOOKAHEAD)), which takes into account the window size, the error goes away.

I am also unsure if this is correct too:

limit = next->strstart > MAX_DIST2 ? next->strstart - MAX_DIST2 : 0;

To me it seems this would work only when the window size is 32KB.

@ericstj
Copy link
Member

ericstj commented Apr 19, 2021

cc @jtkukunas for zlib-intel

@zlatanov
Copy link
Contributor Author

Now that the PR for this bug has been merged in the original repo, how do you suggest we proceed so we can bring it here as well? Should I create a PR for it here, or do you have other procedure for keeping the zlib-intel version in sync with the original repo?

/cc @ericstj @stephentoub @carlossanlop

@jtkukunas
Copy link

I can produce a new release of intel zlib, which will make it easier to consume.

@ericstj
Copy link
Member

ericstj commented Apr 23, 2021

A new release will give a tag to reference.

Last time I did this I think I used patch files to port the changes. @vkvenkat set it up originally IIRC.

@CarnaViire
Copy link
Member

@jkotas You mentioned here #49194 (comment) that the sources are somehow refreshed once in a while. Is it possibly an automated process for updating? Can we just run it after @jtkukunas creates a new release?

@jkotas
Copy link
Member

jkotas commented Apr 23, 2021

Is it possibly an automated process for updating? Can we just run it after @jtkukunas creates a new release?

It is a question for area-System.IO.Compression area owners - @adamsitnik @carlossanlop @jozkee according to https://github.com/dotnet/runtime/blob/main/docs/area-owners.md - on whether they believe that automating this process would be a net win long term.

@CarnaViire
Copy link
Member

Ok, I just got an impression that an automated process exists. If there's none, I guess we'd have to update the sources manually...
@jtkukunas when will you be able to create a release tag, please?

@jozkee jozkee modified the milestones: Future, 6.0.0 Apr 23, 2021
@jozkee jozkee removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Apr 23, 2021
@ericstj
Copy link
Member

ericstj commented Apr 23, 2021

No, nothing automatic: changes don't happen often. Don't do it "manually", create patch files for the last commits and apply them in a PR.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 26, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 26, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants