Arm64: Revisit the heuristics for IO completion poller threads #67266

Open
kunalspathak opened this issue Mar 28, 2022 · 9 comments

@kunalspathak
Member

kunalspathak commented Mar 28, 2022

Today, the number of IO completion poller threads we create depends on the number of processors. The heuristics below came from analysis we did on older Arm64 machines, and the right values may be different on modern Arm64 machines. We need to revisit them or add some kind of auto-tuning for the number of threads created.

Architecture architecture = RuntimeInformation.ProcessArchitecture;
int coresPerEngine = architecture == Architecture.Arm64 || architecture == Architecture.Arm
    ? 8
    : 30;
return Math.Max(1, (int)Math.Round(Environment.ProcessorCount / (double)coresPerEngine));

int processorsPerPoller =
    AppContextConfigHelper.GetInt32Config("System.Threading.ThreadPool.ProcessorsPerIOPollerThread", 12, false);
return (Environment.ProcessorCount - 1) / processorsPerPoller + 1;
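
To make the two heuristics above easier to compare, here is a minimal sketch that evaluates both formulas for a few processor counts. The class and method names (`PollerHeuristics`, `EngineCount`, `PollerCount`) and the `isArm` parameter are made up for illustration; the processor count is passed in rather than read from `Environment.ProcessorCount`, and the config lookup is replaced by a plain default argument of 12.

```csharp
using System;

static class PollerHeuristics
{
    // Mirrors the first quoted snippet: one engine per 8 cores on Arm/Arm64,
    // one per 30 cores otherwise, and never fewer than one engine.
    public static int EngineCount(int processorCount, bool isArm)
    {
        int coresPerEngine = isArm ? 8 : 30;
        return Math.Max(1, (int)Math.Round(processorCount / (double)coresPerEngine));
    }

    // Mirrors the second quoted snippet: ceiling division of the processor
    // count by processorsPerPoller (12 is the default in the quoted code).
    public static int PollerCount(int processorCount, int processorsPerPoller = 12)
    {
        return (processorCount - 1) / processorsPerPoller + 1;
    }

    public static void Main()
    {
        foreach (int n in new[] { 4, 12, 28, 80 })
        {
            Console.WriteLine(
                $"{n} procs: engines(arm)={EngineCount(n, true)}, " +
                $"engines(other)={EngineCount(n, false)}, pollers={PollerCount(n)}");
        }
    }
}
```

For 80 processors the first formula gives 10 engines on Arm64 and the second gives 7 pollers; for 28 processors on a non-Arm architecture they give 1 and 3. Note that `Math.Round` rounds midpoints to the nearest even number by default, so 4 Arm cores (4 / 8 = 0.5) round down to 0 and only reach 1 through `Math.Max`.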

Related: #67180

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler bot added the untriaged label (New issue has not been triaged by the area owner) Mar 28, 2022
@kunalspathak
Member Author

@mangod9

@ghost

ghost commented Mar 28, 2022

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: kunalspathak
Assignees: -
Labels: area-System.Threading, untriaged

Milestone: -

@mangod9 removed the untriaged label (New issue has not been triaged by the area owner) Mar 28, 2022
@mangod9 mangod9 added this to the 7.0.0 milestone Mar 28, 2022
@mangod9
Member

mangod9 commented Mar 28, 2022

@kouvel as fyi.

@MihaZupan
Member

cc: @dotnet/ncl

@adamsitnik
Member

FWIW I am the person who came up with the magic numbers on Unix. Here you can find the data and reasoning behind it.

Recently, after a conversation with @kunalspathak, I wanted to give it a quick try and see how the number of epoll threads affects the throughput of the JSON Platform benchmark on the Ampere machine. Currently more than 80% of the time is spent in ConcurrentQueue.TryDequeue, and we need to solve this blocker (#67845, dotnet/aspnetcore#40476) before we try to change the number of epoll threads.

arm64 with 80 cores:

image

x64 with 28 cores:

image
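
For anyone who wants to repeat this kind of epoll thread-count sweep, here is a rough sketch of how the runs could be set up. It assumes, from my reading of the Unix socket engine source, that the engine count can be overridden through the `DOTNET_SYSTEM_NET_SOCKETS_THREAD_COUNT` environment variable; please verify that against the runtime version you test. The `EpollThreadSweep` class, the `BuildRuns` method, and the benchmark path are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;

static class EpollThreadSweep
{
    // Builds one start configuration per thread count. Nothing is launched here;
    // the caller decides how to start and measure each run.
    public static List<ProcessStartInfo> BuildRuns(string benchmarkPath, IEnumerable<int> threadCounts)
    {
        var runs = new List<ProcessStartInfo>();
        foreach (int count in threadCounts)
        {
            // UseShellExecute must be false for the Environment dictionary to apply.
            var psi = new ProcessStartInfo(benchmarkPath) { UseShellExecute = false };

            // Assumption: the Unix socket engine reads this variable at startup
            // and uses it instead of the cores-per-engine heuristic.
            psi.Environment["DOTNET_SYSTEM_NET_SOCKETS_THREAD_COUNT"] =
                count.ToString(CultureInfo.InvariantCulture);

            runs.Add(psi);
        }
        return runs;
    }
}
```

Each returned `ProcessStartInfo` can be passed to `Process.Start`, for example with `BuildRuns("./benchmark-app", new[] { 1, 2, 5, 10 })`, where the path stands in for whatever server binary is being measured.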

@kunalspathak
Member Author

@mangod9 - Do you think we will be doing this in .NET 7?

@mangod9
Member

mangod9 commented Jun 6, 2022

yeah we hope to investigate this soon. Might be worth validating again after the concurrent queue fix.

@kouvel
Member

kouvel commented Jul 15, 2022

On the Linux Ampere machine, quick tests on JsonPlatform and Json didn't show any significant improvements from decreasing the IO poller thread counts. It still seems like relatively more IO poller threads are necessary on arm64 platforms than on x64 platforms. The heuristics could probably be fine-tuned further, but I haven't done an exhaustive test.

@kouvel kouvel modified the milestones: 7.0.0, 8.0.0 Jul 15, 2022
@mangod9 mangod9 modified the milestones: 8.0.0, Future Jun 23, 2023
@mangod9 mangod9 modified the milestones: Future, 9.0.0 Nov 30, 2023
@mangod9 mangod9 modified the milestones: 9.0.0, Future Jul 3, 2024