Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarks for HashSet, Dictionary comparison in .NET core 3.1, .NET 4.8, and .NET Core 5.0 #2473

Closed

Conversation

Yun-Ting
Copy link
Contributor

Changes

Resolving comment: #2467 (comment)

HashSet has better performance in terms of time and space consumption for the use case in the referenced code review.
Please find the detailed benchmarks for .NET core 3.1, .NET 4.8, and .NET Core 5.0 as below.

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19043
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.401
[Host] : .NET Core 3.1.19 (CoreCLR 4.700.21.41101, CoreFX 4.700.21.41603), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET Core 3.1.19 (CoreCLR 4.700.21.41101, CoreFX 4.700.21.41603), X64 RyuJIT

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
HashSet 6.781 ms 0.1338 ms 0.2734 ms 117.1875 117.1875 117.1875 657.27 KB
Dictionary 7.540 ms 0.1929 ms 0.5689 ms 218.7500 210.9375 195.3125 919.92 KB

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19043
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
[Host] : .NET Framework 4.8 (4.8.4300.0), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET Framework 4.8 (4.8.4300.0), X64 RyuJIT

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
HashSet 9.826 ms 0.1956 ms 0.4906 ms 109.3750 109.3750 109.3750 657.68 KB
Dictionary 10.236 ms 0.2724 ms 0.7989 ms 203.1250 203.1250 187.5000 921.22 KB

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19043
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.401
[Host] : .NET Core 5.0.10 (CoreCLR 5.0.1021.41214, CoreFX 5.0.1021.41214), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET Core 5.0.10 (CoreCLR 5.0.1021.41214, CoreFX 5.0.1021.41214), X64 RyuJIT

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
HashSet 5.530 ms 0.1089 ms 0.2568 ms 117.1875 117.1875 117.1875 657.28 KB
Dictionary 5.583 ms 0.1096 ms 0.1769 ms 218.7500 203.1250 195.3125 919.92 KB

@codecov
Copy link

codecov bot commented Oct 11, 2021

Codecov Report

Merging #2473 (5934dcd) into main (90d2906) will decrease coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2473      +/-   ##
==========================================
- Coverage   79.71%   79.67%   -0.04%     
==========================================
  Files         254      254              
  Lines        8404     8404              
==========================================
- Hits         6699     6696       -3     
- Misses       1705     1708       +3     
Impacted Files Coverage Δ
...Zipkin/Implementation/ZipkinExporterEventSource.cs 63.63% <0.00%> (-9.10%) ⬇️
...ter.ZPages/Implementation/ZPagesActivityTracker.cs 97.14% <0.00%> (-2.86%) ⬇️
...nTelemetry/Internal/OpenTelemetrySdkEventSource.cs 71.02% <0.00%> (-2.81%) ⬇️
...emetry.Api/Internal/OpenTelemetryApiEventSource.cs 82.35% <0.00%> (+5.88%) ⬆️

[Benchmark]
public void HashSet()
{
var testSet = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for our usecase, we are concerned with the lookup cost only. (As creation/population is a one-time operation, vs lookup being performed everytime a new Meter/ActivitySource is created).
Can you modify the benchmark to create/populate the dictionary(hashset) in the setup phase, and do the lookup in the Benchmark method?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, please do run the benchmarks for older .NET Frameworks. Net461 onwards

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own curiosity I tried net462 and net5.0, both are doing reads only (I tested both the hit and miss cases).

net462:

Method Mean Error StdDev Median Gen 0 Gen 1 Gen 2 Allocated
HashSet 40.21 ns 1.780 ns 4.961 ns 40.50 ns - - - -
Dictionary 49.96 ns 2.447 ns 6.820 ns 47.23 ns - - - -

net5.0:

Method Mean Error StdDev Median Gen 0 Gen 1 Gen 2 Allocated
HashSet 52.78 ns 1.098 ns 2.567 ns 52.26 ns - - - -
Dictionary 64.45 ns 2.275 ns 6.453 ns 62.59 ns - - - -

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to use net40 and earlier benchmarks but haven't figured out the dependency issue with the correct benchmarkdotnet version. I'll update the thread if I have stats for earlier versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that what Cijo meant is that versions greater than >= 461 (those are the versions our sdk supports.)
I misunderstood it to be versions <= 461 because I had this perception that earlier frameworks would have worse perf for set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just providing some context #708 (comment).

I think (based on the numbers I got) the HashSet read operation doesn't have perf issue on all the versions of runtime that OpenTelemetry .NET is targeting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested net471 for lookups with commit: 876c000
I noticed that Dictionary did win over HashSet by 0.03%.
But from @reyang 's data, it seems that for net462 and net5.0, Dictionary performs better.
I guess if only in some specific .NET frameworks that Dictionary wins over by less than 0.05%. I would go for hashset, based on its constantly win for space consumption.
What are your thoughts?

// * Summary *

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1237 (21H1/May2021Update)
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
[Host] : .NET Framework 4.8 (4.8.4300.0), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET Framework 4.8 (4.8.4300.0), X64 RyuJIT

Method Mean Error StdDev Allocated
HashSet 4.879 ms 0.0998 ms 0.2942 ms -
Dictionary 4.708 ms 0.0948 ms 0.2794 ms -

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on benchmarks, we can be using HashSet, and not Dictionary as we don't have any noticeable perf issues for the target frameworks we are targeting, and for our scenario. (reads only after startup)
Also HashSet needs less space.

@cijothomas
Copy link
Member

We can close this, as we measured what we wanted to, and this benchmark is not required to be merged to this repo.

@Yun-Ting Yun-Ting closed this Oct 13, 2021
@Yun-Ting Yun-Ting deleted the Yun-Ting/HashSet-Dict-Benchmark branch October 13, 2021 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants