Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADO.NET IHashPicker customization API + Orleans v3-compatible IHashPicker implementation #9217

Merged
merged 13 commits into from
Nov 12, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
using System;
using System.Text;

namespace Orleans.Storage
{
Expand All @@ -22,12 +21,6 @@ private static void Mix(ref uint a, ref uint b, ref uint c)
c -= a; c -= b; c ^= (b >> 15);
}

// This is the reference implementation of the Jenkins hash.
public static uint ComputeHash(byte[] data)
{
return ComputeHash(data.AsSpan());
}

// This is the reference implementation of the Jenkins hash.
public static uint ComputeHash(ReadOnlySpan<byte> data)
{
Expand Down Expand Up @@ -79,36 +72,5 @@ public static uint ComputeHash(ReadOnlySpan<byte> data)
Mix(ref a, ref b, ref c);
return c;
}

public static uint ComputeHash(string data)
{
byte[] bytesToHash = Encoding.UTF8.GetBytes(data);
return ComputeHash(bytesToHash);
}

// This implementation calculates the exact same hash value as the above, but is
// optimized for the case where the input is exactly 24 bytes of data provided as
// three 8-byte unsigned integers.
public static uint ComputeHash(ulong u1, ulong u2, ulong u3)
{
uint a = 0x9e3779b9;
uint b = a;
uint c = 0;

unchecked
{
a += (uint)u1;
b += (uint)((u1 ^ (uint)u1) >> 32);
c += (uint)u2;
Mix(ref a, ref b, ref c);
a += (uint)((u2 ^ (uint)u2) >> 32);
b += (uint)u3;
c += (uint)((u3 ^ (uint)u3) >> 32);
}
Mix(ref a, ref b, ref c);
c += 24;
Mix(ref a, ref b, ref c);
return c;
}
}
}
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
using Orleans.Storage;
using System;

namespace Orleans.Storage
{
Expand All @@ -15,7 +15,12 @@ internal class Orleans3CompatibleHasher : IHasher
/// <summary>
/// <see cref="IHasher.Hash(byte[])"/>.
/// </summary>
public int Hash(byte[] data)
public int Hash(byte[] data) => Hash(data.AsSpan());

/// <summary>
/// <see cref="IHasher.Hash(byte[])"/>.
/// </summary>
public int Hash(ReadOnlySpan<byte> data)
{
// implementation restored from Orleans v3.7.2: https://github.com/dotnet/orleans/blob/b24e446abfd883f0e4ed614f5267eaa3331548dc/src/AdoNet/Orleans.Persistence.AdoNet/Storage/Provider/OrleansDefaultHasher.cs
return unchecked((int)JenkinsHash.ComputeHash(data));
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
using System;
using System.Buffers;
using System.Text;

namespace Orleans.Storage
Expand Down Expand Up @@ -37,17 +38,63 @@ public int Hash(byte[] data)
// PickHasher parameters are the same for both calls so we need to analyze data content to distinguish these cases.
// It doesn't word if string key is equal to grain type name, but we consider this edge case to be negligibly rare.

// reducing allocations if data is not a grain type
if (data.Length >= _grainType.Length && Encoding.UTF8.GetByteCount(_grainType) == data.Length)
if (IsGrainTypeName(data))
return _innerHasher.Hash(data);

var extendedLength = data.Length + 8;
if (extendedLength <= 256)
{
Span<byte> extended = stackalloc byte[extendedLength];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stack-allocated should be to a constant of power of 2, then sliced. It's cheaper.
The extra space (extended) should also be cleared, just to be safe (see SkipLocalsInit).

data.AsSpan().CopyTo(extended);
return _innerHasher.Hash(extended);
}

var buffer = ArrayPool<byte>.Shared.Rent(extendedLength);
try
{
data.AsSpan().CopyTo(buffer);
Array.Clear(buffer, data.Length, 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the Clear needed?
For the stack-alloc path above you can safely assume that the stack-space is zeroed. E.g. when there's [SkipLocalsInit] applied.

I'd combine the two pathes, to avoid the duplicated logic (though it's very little logic here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, we need to clear rented buffer to make sure all these bytes set to zero - rented buffer can contain arbitrary data

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I misread the code here. The extra 8 bytes need to be zeroed.
Thanks for the hint!

As the span also needs to be cleared, these two code-pathes should be collapsed (see other comments).

return _innerHasher.Hash(buffer.AsSpan(0, extendedLength));
}
finally
{
ArrayPool<byte>.Shared.Return(buffer);
}
}

private bool IsGrainTypeName(byte[] data)
{
// at least 1 byte per char
if (data.Length < _grainType.Length)
return false;

var grainTypeByteCount = Encoding.UTF8.GetByteCount(_grainType);
if (grainTypeByteCount != data.Length)
return false;

if (grainTypeByteCount <= 256)
{
var grainTypeBytes = Encoding.UTF8.GetBytes(_grainType);
if (grainTypeBytes.AsSpan().SequenceEqual(data))
return _innerHasher.Hash(data);
Span<byte> grainTypeBytes = stackalloc byte[grainTypeByteCount];
if (!Encoding.UTF8.TryGetBytes(_grainType, grainTypeBytes, out _))
throw new InvalidOperationException();

return grainTypeBytes.SequenceEqual(data);
}

var extendedData = new byte[data.Length + 8];
data.CopyTo(extendedData, 0);
return _innerHasher.Hash(extendedData);
var buffer = ArrayPool<byte>.Shared.Rent(grainTypeByteCount);
try
{
var grainTypeBytes = buffer.AsSpan(0, grainTypeByteCount);

if (!Encoding.UTF8.TryGetBytes(_grainType, grainTypeBytes, out _))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, combine the code paths.

throw new InvalidOperationException();

return grainTypeBytes.SequenceEqual(data);
}
finally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally isn't needed. In case of failure the rented array will just be dropped (on the GCed). The array-pool is tolerant to this.
W/o finally there's also room for the JIT to do more optimizations.

The pattern w/ try-finally for the array pool got used some time ago in .NET, but it isn't anymore.
(Same above)

{
ArrayPool<byte>.Shared.Return(buffer);
}
}
}
}