-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PERF] DaysInMonth optimization #95229
Conversation
Tagging subscribers to this area: @dotnet/area-system-datetime Issue DetailsThe current code to calculate the days in a month uses a While looking at this issue, I came up with an alternative approach. Every month has at least 28 days with 0-3 "extra days", so to encode the total number of days in each month, we only need 2 bits per month times 12 months for a total of 24 bits which conveniently fits inside an Benchmarking this was a bit difficult, so I tried my best to create a realistic scenario. Micro-benchmarking this code had some mixed results (AOT was always faster, but non-AOT was usually equivalent). Also note, prior to net8, the codegen for this PR wasn't optimal unless I reorganized the variables/calculations, but this seems to be fixed now. This is my first PR to the runtime, so please let me know if I need to change anything. The [SimpleJob(runtimeMoniker: RuntimeMoniker.Net80)]
[SimpleJob(runtimeMoniker: RuntimeMoniker.NativeAot80)]
public class GetDaysInMonth
{
private static ReadOnlySpan<byte> DaysInMonth365 => [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
private static ReadOnlySpan<byte> DaysInMonth366 => [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
const int DaysInMonth = 0b_11_10_11_10_11_11_10_11_10_11_00_11_00;
const int DaysInMonthLeapYear = DaysInMonth | 0x10;
const byte ExtraDaysMask = 0x3;
static int[] years = { 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030, 2031, 2032 };
static int[] months = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
static (int Year, int Month)[] loop = ( from x in years
from m in months
select (x, m)).ToArray();
[Benchmark(Baseline = true)]
public void Base() { foreach (var x in loop) Base(x.Year, x.Month); }
[Benchmark]
public void PR() { foreach (var x in loop) PR(x.Year, x.Month); }
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static int Base(int year, int month)
{
if (month < 1 || month > 12) { throw new ArgumentOutOfRangeException(nameof(month)); }
return (IsLeapYear(year) ? DaysInMonth366 : DaysInMonth365)[month - 1];
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static int PR(int year, int month)
{
if (month < 1 || month > 12) { throw new ArgumentOutOfRangeException(nameof(month)); }
int days = IsLeapYear(year) ? DaysInMonthLeapYear : DaysInMonth;
return 28 + (days >> (2 * month) & ExtraDaysMask);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsLeapYear(int year)
{
// zero or negative bit set or
if (year < 1 || year > 9999)
{
throw new ArgumentOutOfRangeException();
}
if ((year & 3) != 0) return false;
if ((year & 15) == 0) return true;
return (uint)year % 25 != 0;
}
} BenchmarkDotNet v0.13.10, Windows 11 (10.0.22621.2715/22H2/2022Update/SunValley2)
13th Gen Intel Core i5-13600K, 1 CPU, 20 logical and 14 physical cores
.NET SDK 8.0.100
[Host] : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
.NET 8.0 : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
NativeAOT 8.0 : .NET 8.0.0, X64 NativeAOT AVX2
| Method | Job | Runtime | Mean | Error | StdDev | Ratio |
|------- |-------------- |-------------- |----------:|---------:|---------:|------:|
| Base | .NET 8.0 | .NET 8.0 | 110.05 ns | 0.376 ns | 0.352 ns | 1.00 |
| PR | .NET 8.0 | .NET 8.0 | 82.28 ns | 0.155 ns | 0.145 ns | 0.75 |
| | | | | | | |
| Base | NativeAOT 8.0 | NativeAOT 8.0 | 115.87 ns | 0.187 ns | 0.175 ns | 1.00 |
| PR | NativeAOT 8.0 | NativeAOT 8.0 | 71.51 ns | 0.091 ns | 0.085 ns | 0.62 |
|
@@ -1170,7 +1171,8 @@ public static int DaysInMonth(int year, int month) | |||
{ | |||
if (month < 1 || month > 12) ThrowHelper.ThrowArgumentOutOfRange_Month(month); | |||
// IsLeapYear checks the year argument | |||
return (IsLeapYear(year) ? DaysInMonth366 : DaysInMonth365)[month - 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return (IsLeapYear(year) ? DaysInMonth366 : DaysInMonth365)[month - 1];
I thought the jit should be optimizing the span indexing. Maybe something we can try instead of doing subtract 1 every time, we can have the span contents starts with 0 and avoid the subtraction operation and measure.
Also, will be good to do the measurements on the real code and try to measure DateTime.DaysInMonth with and without the changes instead of writing a separate code mimic the library's code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @stephentoub
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JIT can optimize and constant fold accesses to RVA static data (like the DaysInMonth365
/DaysInMonth366
data currently is).
Altogether, this new code seems a lot less readable for what is barely 1/5th of a nanosecond faster (basically 1-2 instruction cycles).
This is unlikely to make any difference in any real world code and I personally don't think the additional complexity is worth it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @tannergooding here, if the perf gain is not going to make real difference in the public scenario, it is not worth the complexity.
@tarekgh If I try to benchmark [SimpleJob(runtimeMoniker: RuntimeMoniker.NativeAot80)]
[SimpleJob(runtimeMoniker: RuntimeMoniker.Net80)]
public class GetDaysInMonth
{
private static ReadOnlySpan<byte> DaysInMonth365 => [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
const int DaysInMonth = 0b_11_10_11_10_11_11_10_11_10_11_00_11_00;
const byte ExtraDaysMask = 0b11;
[MethodImpl(MethodImplOptions.NoInlining)]
private int GetMonth() => 1;
[Benchmark(Baseline = true)]
public int Base() => DaysInMonth365[GetMonth()];
[Benchmark]
public int PR() => 28 + (DaysInMonth >> (2 * GetMonth()) & ExtraDaysMask);
[Benchmark]
public int MaskShiftHack()
{
var month = GetMonth();
var shift = 2 * month;
return 28 + (DaysInMonth >> shift & ExtraDaysMask);
}
} | Method | Job | Runtime | Mean | Error | StdDev | Ratio |
|-------------- |-------------- |-------------- |----------:|----------:|----------:|------:|
| Base | .NET 8.0 | .NET 8.0 | 1.1249 ns | 0.0051 ns | 0.0047 ns | 1.00 |
| PR | .NET 8.0 | .NET 8.0 | 1.1224 ns | 0.0034 ns | 0.0032 ns | 1.00 |
| MaskShiftHack | .NET 8.0 | .NET 8.0 | 0.9232 ns | 0.0018 ns | 0.0017 ns | 0.82 |
| | | | | | | |
| Base | NativeAOT 8.0 | NativeAOT 8.0 | 0.9435 ns | 0.0036 ns | 0.0033 ns | 1.00 |
| PR | NativeAOT 8.0 | NativeAOT 8.0 | 0.8051 ns | 0.0029 ns | 0.0028 ns | 0.85 |
| MaskShiftHack | NativeAOT 8.0 | NativeAOT 8.0 | 0.9395 ns | 0.0023 ns | 0.0022 ns | 1.00 | @jakobbotsch There may be some codegen issue here because I assume |
Normally, we benchmark public end-points, and the algorithm is:
Benchmarking of internal parts separately often doesn't tell real impact of a change. |
@EgorBo I'm having difficulty when running
I don't have version |
This is a known pain with MSVC update. You can delete all |
I am closing this PR. We can consider reopening it if proven it is worth having this optimization. Thanks! |
The current code to calculate the days in a month uses a
ReadOnlySpan<byte>
which is accessed by it's index (month). This was optimized in net8 due to some new inlining feature and is blazing fast now.While looking at this issue, I came up with an alternative approach. Every month has at least 28 days with 0-3 "extra days", so to encode the total number of days in each month, we only need 2 bits per month times 12 months for a total of 24 bits which conveniently fits inside an
int
. Applying a right shift and a mask of the 2 least significant bits computes the extra days for a given month, and then we just add 28 to get the total.Benchmarking this was a bit difficult, so I tried my best to create a realistic scenario. Micro-benchmarking this code had some mixed results (AOT was always faster, but non-AOT was usually equivalent). Also note, prior to net8, the codegen for this PR wasn't optimal unless I reorganized the variables/calculations, but this seems to be fixed now. This is my first PR to the runtime, so please let me know if I need to change anything. The
DaysInMonth
variable is expressed in binary and theExtraDaysMask
as hex, and I'm not sure which is preferred. Here are the benchmarks and results.