Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality for serializing depot manifests #1113

Merged
merged 1 commit into from
Sep 13, 2022

Conversation

NicknineTheEagle
Copy link
Contributor

@NicknineTheEagle NicknineTheEagle commented Jul 17, 2022

This adds functionality for serializing depot manifests into Steam-compatible protobuf files. The output matches .manifest files found in Steam's depotcache folder unless there are non-ASCII filenames present. It seems that Steam handles filename sorting slightly differently, not quite sure what's going on there.

I'm intending to use this as a stepping stone towards migrating DepotDownloader to using Steam protobuf manifests.

@NicknineTheEagle NicknineTheEagle force-pushed the nn/manifest branch 3 times, most recently from 79fea9f to 9094e44 Compare July 18, 2022 15:31
@psychonic psychonic merged commit 3941fab into SteamRE:master Sep 13, 2022
@NicknineTheEagle
Copy link
Contributor Author

@psychonic How soon is the next SteamKit release? Want to know when I can integrate these changes into DepotDownloader.

@psychonic
Copy link
Member

I imagine not until the new Steam authentication is worked out and added. (#1125 / #1129), but I don't really manage the releases so can't say for sure.

@yaakov-h
Copy link
Member

Usually its whenever I decide to publish one 😄

@xPaw xPaw linked an issue Mar 21, 2023 that may be closed by this pull request
@xPaw xPaw added this to the 2.5.0 milestone Mar 21, 2023
@skylayer
Copy link
Contributor

Thank you for your contribution to the serialization logic of DepotManifest. I appreciate the effort you've put into implementing this feature. However, I've identified a critical performance issue within your implementation, specifically related to how unique chunks are determined:

if ( !uniqueChunks.Exists( x => x.SequenceEqual( chunk.ChunkID! ) ) )
{
    uniqueChunks.Add( chunk.ChunkID! );
}

This approach, unfortunately, leads to each existence check having a time complexity of O(N), resulting in an overall complexity of O(N^2) for this section of the code. This becomes significantly problematic for large manifests, such as those exceeding 100GB, where the serialization process could extend up to 30 minutes or more.

To address this issue and enhance the performance, I suggest utilizing a HashSet alongside a custom IEqualityComparer<byte[]>. This modification ensures that each check operates in near O(1) time, substantially reducing the total serialization time. Below is a proposed implementation:

class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(byte[] obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));
        // Adjust the hash code calculation as needed to ensure uniform distribution
        return BitConverter.ToInt32(obj, 0);
    }
}

// Usage...

var uniqueChunks = new HashSet<byte[]>(new ByteArrayComparer());

foreach (var chunk in file.Chunks)
{
    var protochunk = new ContentManifestPayload.FileMapping.ChunkData()
    {
        sha = chunk.ChunkID,
        crc = BitConverter.ToUInt32(chunk.Checksum!, 0),
        offset = chunk.Offset,
        cb_original = chunk.UncompressedLength,
        cb_compressed = chunk.CompressedLength
    };

    protofile.chunks.Add(protochunk);
    // Directly use HashSet.Add for O(1) complexity
    uniqueChunks.Add(chunk.ChunkID!);
}

By implementing this solution, we can significantly improve the performance of the serialization process for large depot manifests. I hope you find this suggestion helpful and consider integrating it into your pull request. Thank you again for your valuable contribution.

foreach ( var file in Files )
{
var protofile = new ContentManifestPayload.FileMapping();
protofile.filename = file.FileName.Replace( '/', '\\' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was incorrectly replacing slashes if the filename was not decrypted.

Discovered while adding a roundtrip test: 7559904

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: Ability to serialize manifests
5 participants