Vectorize Convert.ToBase64CharArray and TryToBase64Chars #73320
#71795 vectorized Convert.ToBase64String for larger inputs by using Base64.EncodeToUtf8 and then transcoding the resulting UTF-8 bytes into a UTF-16 string. It did not touch Convert.ToBase64CharArray or Convert.TryToBase64Chars, however. The ToBase64String change uses a temporary array rented from the array pool. The expectation is that renting will rarely allocate, and if it does, it happens in a method that is already allocating the resulting string, so the impact is presumed to be small. ToBase64CharArray and TryToBase64Chars, however, are intended to be entirely non-allocating, so even renting from the array pool could be problematic if it fails to find a buffer in the pool.

This PR changes the non-allocating variants to use Base64.EncodeToUtf8 as well. But instead of renting a temporary buffer, it banks on the knowledge that the encoded Base64 bytes occupy half the space of the resulting chars: every byte is guaranteed to be ASCII, so each one-byte value widens to exactly one two-byte char. Thus, it can treat the destination char buffer as scratch space for the encoded UTF-8 bytes and then widen them in place. This obviates the need for a separate temporary buffer, making the approach appropriate for the non-allocating versions. And once we have the helper for those, we can use that same helper to replace the code added to ToBase64String, making it non-allocating as well (beyond, of course, the result string it has to allocate by its nature), and thus more predictable.

Overall, this fixes the possible additional allocation in ToBase64String as well as a performance inversion: the allocating ToBase64String could have been significantly faster (due to vectorization) than the ToBase64CharArray and TryToBase64Chars methods that are intended to be the faster versions.
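The in-place idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class and method names (`Base64InPlaceSketch`, `TryEncodeToChars`) are hypothetical, line-break options are ignored, and the widening loop is a plain scalar loop. It only relies on the public `Base64.EncodeToUtf8`, `Base64.GetMaxEncodedToUtf8Length`, and `MemoryMarshal.AsBytes` APIs.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only; not the runtime's implementation.
public static class Base64InPlaceSketch
{
    public static bool TryEncodeToChars(ReadOnlySpan<byte> source, Span<char> destination, out int charsWritten)
    {
        charsWritten = 0;

        // Padded Base64 output length in bytes, which is also the output length in chars.
        int encodedLength = Base64.GetMaxEncodedToUtf8Length(source.Length);
        if (destination.Length < encodedLength)
        {
            return false;
        }

        // Reinterpret the char buffer as bytes. It is twice as large in bytes as the
        // encoded ASCII output, so the encoded bytes fit in its first half.
        Span<byte> destBytes = MemoryMarshal.AsBytes(destination.Slice(0, encodedLength));

        OperationStatus status = Base64.EncodeToUtf8(source, destBytes, out _, out int bytesWritten);
        if (status != OperationStatus.Done)
        {
            return false;
        }

        // Widen right to left. Char i occupies bytes 2i and 2i+1, which lie at or beyond
        // every byte that has not been read yet, so no unread data is overwritten.
        for (int i = bytesWritten - 1; i >= 0; i--)
        {
            destination[i] = (char)destBytes[i];
        }

        charsWritten = bytesWritten;
        return true;
    }
}
```

Calling `Base64InPlaceSketch.TryEncodeToChars(data, buffer, out int written)` with a sufficiently large `buffer` should yield the same characters as `Convert.ToBase64String(data)` without any temporary buffer.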
Tagging subscribers to this area: @dotnet/area-system-runtime

using System;
using BenchmarkDotNet.Attributes;

// Class name is illustrative; the original snippet omitted the enclosing type.
public class Base64EncodeBenchmarks
{
    [Params(16, 64, 256, 1024)]
    public int Length { get; set; }

    private byte[] _data;
    private char[] _scratch;

    [GlobalSetup]
    public void Setup()
    {
        _data = new byte[Length];
        // Base64 needs roughly 4/3 chars per input byte, so Length * 4 is always enough.
        _scratch = new char[Length * 4];
        var r = new Random(42); // fixed seed for repeatable input
        r.NextBytes(_data);
    }

    [Benchmark]
    public string ToBase64String() => Convert.ToBase64String(_data);

    [Benchmark]
    public void ToBase64CharArray() => Convert.ToBase64CharArray(_data, 0, _data.Length, _scratch, 0);

    [Benchmark]
    public void ToBase64Chars() => Convert.TryToBase64Chars(_data, _scratch, out _);
}
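using System;
using BenchmarkDotNet.Attributes;

// Class name is illustrative; the original snippet omitted the enclosing type.
public class Base64EncodeBenchmarks
{
    [Params(8, 16, 22, 32, 46, 64, 70, 128, 256, 1024)]
    public int Length { get; set; }

    private byte[] _data;
    private char[] _scratch;

    [GlobalSetup]
    public void Setup()
    {
        _data = new byte[Length];
        // Base64 needs roughly 4/3 chars per input byte, so Length * 4 is always enough.
        _scratch = new char[Length * 4];
        var r = new Random(42); // fixed seed for repeatable input
        r.NextBytes(_data);
    }

    [Benchmark]
    public string ToBase64String() => Convert.ToBase64String(_data);

    [Benchmark]
    public void ToBase64CharArray() => Convert.ToBase64CharArray(_data, 0, _data.Length, _scratch, 0);
}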
Failure is #73247
linux/arm64 improvements dotnet/perf-autofiling-issues#7250
windows/arm64 improvements dotnet/perf-autofiling-issues#7244