Improvements of Random.GetItems<T> performance #82286
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime |
See #79790 let’s merge that first |
An alternative would be making
We could explore the approach of having a
I would not want to use unsafe code here as in your example for indexing into choices, in particular if the values being randomly generated aren't controlled by us. In the existing implementation, for example, it's possible for a derived implementation to return a value from Next that's erroneously out of range; today that would result in an exception, but if the bounds checks were removed explicitly, it could result in walking off the span. |
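To illustrate the concern above, here is a minimal sketch, assuming a hypothetical derived type `BrokenRandom` whose `Next(int)` returns an out-of-range value. With ordinary bounds-checked span indexing the bad index surfaces as an exception instead of a read past the end of `choices`. The names `BrokenRandom` and `BoundsCheckDemo` are illustrative only and are not part of the runtime.

```csharp
using System;

// Hypothetical derived Random whose Next(int) violates the documented contract.
sealed class BrokenRandom : Random
{
    // Returns maxValue itself, which is one past the last valid index.
    public override int Next(int maxValue) => maxValue;
}

static class BoundsCheckDemo
{
    // Bounds-checked indexing, similar in spirit to a simple GetItems loop.
    public static void Fill<T>(Random rnd, ReadOnlySpan<T> choices, Span<T> destination)
    {
        for (int i = 0; i < destination.Length; i++)
        {
            // The span indexer validates the index and throws if it is out of range.
            destination[i] = choices[rnd.Next(choices.Length)];
        }
    }

    public static bool ThrowsOnBadIndex()
    {
        Span<char> destination = stackalloc char[4];
        try
        {
            Fill(new BrokenRandom(), "abc".AsSpan(), destination);
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }
}
```

If the indexing were replaced with unchecked pointer arithmetic, the same faulty generator would read memory outside the span rather than throw.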
True. But in my example, all we need is to make |
Understood. Which leads into my previous comment about ArrayPool allocation. If we want to explore making a single call for sizes we're willing to stack allocate, that'd be ok, although it'd then be weird if we used a derived Random's NextBytes some of the time and Next(int) other times. |
@danmoseley , #79790 has been merged. I'm ready to proceed with this task. Could you assign it to me? |
Preliminary benchmark results (new changes from PR are taken into account):
Host: .NET 6.0.14 (6.0.1423.7309), X64 RyuJIT AVX2
Source code for the benchmark:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Order;
using System;
using System.Runtime.CompilerServices;

namespace DotNext;

[SimpleJob(runStrategy: RunStrategy.Throughput, launchCount: 1)]
[Orderer(SummaryOrderPolicy.FastestToSlowest)]
public class RandomStringBenchmark
{
    // 16 choices (a power of 2) and 17 choices (not a power of 2).
    [Params("1234567890abcdef", "1234567890abcdefg")]
    public string AllowedChars;

    private readonly Random rnd = new();

    [Benchmark]
    public void GetItemsOptimized()
    {
        Span<char> destination = stackalloc char[36];
        RandomExtensions.NextChars(rnd, AllowedChars, destination);
    }

    [Benchmark(Baseline = true)]
    public void GetItemsFromDotNet()
    {
        Span<char> destination = stackalloc char[36];
        GetItems(rnd, AllowedChars, destination);

        // Local copy of the item-selection loop used as the baseline.
        static void GetItems<T>(Random rnd, ReadOnlySpan<T> choices, Span<T> destination)
        {
            if (choices.IsEmpty)
            {
                throw new ArgumentException(nameof(choices));
            }

            for (int i = 0; i < destination.Length; i++)
            {
                destination[i] = choices[(int)NextUInt32((uint)choices.Length, rnd)];
            }
        }
    }

    // Multiply-shift range reduction with a rejection loop.
    // Note: Random.Next() returns a non-negative int, so it supplies 31 random bits, not 32.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private static uint NextUInt32(uint maxValue, Random rnd)
    {
        ulong randomProduct = (ulong)maxValue * (uint)rnd.Next();
        uint lowPart = (uint)randomProduct;

        if (lowPart < maxValue)
        {
            uint remainder = (0u - maxValue) % maxValue;

            while (lowPart < remainder)
            {
                randomProduct = (ulong)maxValue * (uint)rnd.Next();
                lowPart = (uint)randomProduct;
            }
        }

        return (uint)(randomProduct >> 32);
    }
}
|
@sakno are you still working on this? would be a nice win. |
@danmoseley, yes, I'm working on it. The work is paused for now: I'm waiting for PR #83305 because it may affect my changes (merge conflicts, and a new algorithm that may be more efficient than my proposal). |
.NET 8 introduces a new API for generating random sequences from the specified choices with the Random.GetItems<T> and RandomNumberGenerator.GetItems<T> methods. I found that the recently submitted implementation is not as fast as it can be:
runtime/src/libraries/System.Private.CoreLib/src/System/Random.cs
Lines 190 to 199 in 05e1fe1
The Next method is virtual and is called within the loop, which prevents it from being inlined (the loop inside XoshiroImpl.Next prevents this as well). Moreover, XoshiroImpl generates a 64-bit number efficiently, but the current implementation uses only 32 bits (because a span index is of type int). Thus, half of the random bits are simply dropped. If all 64 bits could be reused, performance could improve by up to a factor of two.
This approach is implemented in my open-source library, which gives the following differences in the benchmark:
, where AllowedChars is the choices span. The length of the destination buffer is 36. When the length of choices is a power of 2, additional optimizations are possible that give even better performance.
Instead of calling the Next method, I prefer to generate a vector of random 32-bit integers only once using NextBytes. Each element of the vector represents an index within choices. All I need here is to keep each index in range (when the length is a power of 2, modulo can be replaced with a bitwise AND). Otherwise, the index can be adjusted without the division (or modulo) operator and the loop, as described by Daniel Lemire in his paper on arXiv.
Here is the implementation of the whole algorithm from my library (MIT licensed).
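The following is not the library code referenced above. It is only a minimal sketch of the idea as described, under stated assumptions: the helper name `GetItemsSketch.Fill` and the 256-item cap are hypothetical, the random words come from a single `NextBytes` call, a power-of-2 length uses a bitwise AND, and any other length uses a multiply-shift reduction. The rejection step that removes the small residual bias of multiply-shift is left out for brevity.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

static class GetItemsSketch
{
    // Hypothetical helper; neither the DotNext nor the runtime implementation.
    public static void Fill<T>(Random rnd, ReadOnlySpan<T> choices, Span<T> destination)
    {
        if (choices.IsEmpty)
            throw new ArgumentException("Choices must not be empty.", nameof(choices));

        // Assumption: the sketch only handles sizes that are safe to stack allocate.
        if (destination.Length > 256)
            throw new ArgumentException("This sketch handles at most 256 items.", nameof(destination));

        // One virtual call fills the whole vector of 32-bit random words.
        Span<uint> words = stackalloc uint[destination.Length];
        rnd.NextBytes(MemoryMarshal.AsBytes(words));

        uint n = (uint)choices.Length;
        if (BitOperations.IsPow2(n))
        {
            // Power of 2: x % n is the same as x & (n - 1).
            uint mask = n - 1;
            for (int i = 0; i < destination.Length; i++)
                destination[i] = choices[(int)(words[i] & mask)];
        }
        else
        {
            // Multiply-shift maps a 32-bit word into [0, n) without division.
            // Slightly biased unless a rejection step is added.
            for (int i = 0; i < destination.Length; i++)
                destination[i] = choices[(int)(((ulong)words[i] * n) >> 32)];
        }
    }
}
```

The span indexers stay bounds-checked here, so an unexpected index would still throw rather than read outside `choices`.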
For a relatively small destination, the buffer can be allocated on the stack. Otherwise, ArrayPool can be used to rent the necessary vector.
If this looks interesting, I'm ready to adopt and port the implementation.
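As a rough sketch of the stack-or-pool choice mentioned above: the threshold `StackLimit`, the delegate `WordsConsumer`, and the helper `WithRandomWords` are all hypothetical names chosen for illustration, and a real implementation may pick a different cutoff.

```csharp
using System;
using System.Buffers;
using System.Runtime.InteropServices;

static class RandomWordsSketch
{
    // Hypothetical cutoff, measured in 32-bit words (512 bytes of stack).
    private const int StackLimit = 128;

    // A custom delegate is used because Span<T> cannot be a generic
    // type argument (for example in Action<T>) in older C# versions.
    public delegate void WordsConsumer(ReadOnlySpan<uint> words);

    public static void WithRandomWords(Random rnd, int count, WordsConsumer consumer)
    {
        uint[]? rented = null;

        // Small requests use the stack; larger ones rent from the shared pool.
        Span<uint> words = count <= StackLimit
            ? stackalloc uint[StackLimit]
            : (rented = ArrayPool<uint>.Shared.Rent(count));

        // Rent may return a larger array, so trim to the requested size.
        words = words.Slice(0, count);

        try
        {
            rnd.NextBytes(MemoryMarshal.AsBytes(words));
            consumer(words);
        }
        finally
        {
            if (rented is not null)
                ArrayPool<uint>.Shared.Return(rented);
        }
    }
}
```

Returning the array in a `finally` block keeps the pool balanced even if the consumer throws.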
P.S.: Benchmark results were obtained on .NET SDK 6.0.406 on Linux.