Arm64: Environment.ProcessorCount returns wrong value on higher core machine #67180

Closed
kunalspathak opened this issue Mar 26, 2022 · 19 comments · Fixed by #68639

Comments

@kunalspathak
Member

kunalspathak commented Mar 26, 2022

In #45943, we made a breaking change to have Environment.ProcessorCount take processor affinity into account on Windows. It works well on x64 machines, but on arm64 machines with a higher CPU count, it returns the wrong value.

Although #47427 calls out that code relying on Environment.ProcessorCount for parallelism should be updated to scale better given the reduced return value, there is performance-sensitive code in IOThreadPool that relies on Environment.ProcessorCount to decide on the number of "IO completion poller threads" to use in the application.

int processorsPerPoller =
    AppContextConfigHelper.GetInt32Config("System.Threading.ThreadPool.ProcessorsPerIOPollerThread", 12, false);
return (Environment.ProcessorCount - 1) / processorsPerPoller + 1;

On a higher-core Arm64 machine, had ProcessorCount returned the correct value, we would create more IO completion threads to handle more concurrent requests. In an internal application, I can see that doing so boosts performance by 2.25x on higher-core Arm64 machines.
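
To make the effect of an under-reported processor count concrete, here is a minimal sketch that applies the same integer arithmetic as the snippet above to a few processor counts. The helper name `PollerCount` and the sample counts are illustrative assumptions, not values measured on the machine in question; only the formula and the default of 12 come from the snippet.

```csharp
using System;

// Same integer arithmetic as the IOThreadPool snippet quoted above.
// The default of 12 processors per poller is taken from that snippet.
static int PollerCount(int processorCount, int processorsPerPoller = 12)
    => (processorCount - 1) / processorsPerPoller + 1;

// Hypothetical processor counts, chosen only to show how the result scales.
foreach (int count in new[] { 12, 13, 32, 64, 128 })
{
    Console.WriteLine($"{count} processors -> {PollerCount(count)} poller thread(s)");
}
```

With the default of 12, a reported count of 32 yields 3 poller threads while 128 yields 11, so a process that sees only one processor group gets proportionally fewer pollers.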

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Threading untriaged New issue has not been triaged by the area owner labels Mar 26, 2022
@ghost

ghost commented Mar 26, 2022

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@kunalspathak
Member Author

@kouvel @mangod9

@mangod9
Member

mangod9 commented Mar 26, 2022

this looks to be hardware specific -- I couldn't repro on an arm64 VM with 32 cores.

@kunalspathak
Member Author

When I set COMPlus_GCCpuGroup=1 and COMPlus_Thread_UseAllCpuGroups=1, it reports the correct number of cores.

@AntonLapounov
Member

I am helping @kunalspathak to debug this.

@AntonLapounov
Member

I can confirm the issue. The program reports the incorrect value of Environment.ProcessorCount every other run, which is certainly weird.

@mangod9
Member

mangod9 commented Mar 26, 2022

hmm, interesting. Is this every other run of the process or subsequent calls in the same process? I don't seem to be able to repro it on a 32 core VM fyi.

@AntonLapounov
Member

The machine in question has two processor groups of different sizes. Unless you set environment variables to use all CPU groups, Environment.ProcessorCount returns the number of processors in the processor group the OS started the process on. Therefore, you may see different returned values from run to run and that is the expected behavior.
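
One way to check whether a machine has processor groups of different sizes is to compare Environment.ProcessorCount with the per-group counts reported by Windows. The sketch below assumes the Win32 functions GetActiveProcessorGroupCount and GetActiveProcessorCount from kernel32.dll; it is a diagnostic aid only, not what the runtime does internally.

```csharp
using System;
using System.Runtime.InteropServices;

// Win32 APIs from kernel32.dll, declared as static extern local functions (C# 9+).
[DllImport("kernel32.dll")]
static extern ushort GetActiveProcessorGroupCount();

[DllImport("kernel32.dll")]
static extern uint GetActiveProcessorCount(ushort groupNumber);

const ushort AllProcessorGroups = 0xFFFF; // ALL_PROCESSOR_GROUPS in the Windows SDK

Console.WriteLine($"Environment.ProcessorCount: {Environment.ProcessorCount}");

if (OperatingSystem.IsWindows())
{
    // Print the size of each processor group, then the total across all groups.
    ushort groupCount = GetActiveProcessorGroupCount();
    for (ushort groupIndex = 0; groupIndex < groupCount; groupIndex++)
    {
        Console.WriteLine($"  group {groupIndex}: {GetActiveProcessorCount(groupIndex)} logical processors");
    }
    Console.WriteLine($"  all groups: {GetActiveProcessorCount(AllProcessorGroups)} logical processors");
}
else
{
    // The native calls are skipped entirely on other operating systems.
    Console.WriteLine("Processor groups are a Windows concept; nothing more to report here.");
}
```

If the groups differ in size, running this several times should show Environment.ProcessorCount matching one group or the other depending on where the OS started the process.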

@mangod9
Member

mangod9 commented Mar 26, 2022

ok that makes sense. Need to determine why there is an asymmetric config on that hardware.

@AntonLapounov
Member

According to the documentation, in Windows 11 processes are no longer constrained to a single processor group by default. I guess that means we should consider UseAllCpuGroups enabled by default, which would avoid this issue.

@kunalspathak
Member Author

@AntonLapounov - any further update on this?

@AntonLapounov
Member

I have started discussion #67308 and an internal email thread with OS folks regarding this. I assume you can set the environment variables as a workaround until we have a fix.
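
For anyone applying the workaround from a launcher rather than a shell, here is a rough sketch that starts a child .NET process with the two variables mentioned earlier in this thread. It assumes the settings have to be in the child's environment before its runtime starts; the helper name `WithAllCpuGroups` and the `dotnet MyApp.dll` command line are placeholders.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: builds launch settings for a child process with the two
// environment variables from this thread set to 1.
static ProcessStartInfo WithAllCpuGroups(string fileName, string arguments = "")
{
    // UseShellExecute must be false for the Environment dictionary to be applied.
    var startInfo = new ProcessStartInfo(fileName, arguments) { UseShellExecute = false };
    startInfo.Environment["COMPlus_GCCpuGroup"] = "1";
    startInfo.Environment["COMPlus_Thread_UseAllCpuGroups"] = "1";
    return startInfo;
}

// "dotnet" and "MyApp.dll" are placeholders for the real application.
ProcessStartInfo info = WithAllCpuGroups("dotnet", "MyApp.dll");
foreach (string name in new[] { "COMPlus_GCCpuGroup", "COMPlus_Thread_UseAllCpuGroups" })
{
    Console.WriteLine($"{name}={info.Environment[name]}");
}

// Process.Start(info) would launch the child with these settings applied.
```

Setting the same two variables in the shell or service configuration before starting the app is the equivalent, simpler route.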

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 28, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 2, 2022
@janvorli
Member

@AntonLapounov can you please set the milestone of this issue to remove the untriaged label?

@AntonLapounov AntonLapounov removed the untriaged New issue has not been triaged by the area owner label Jun 20, 2022
@AntonLapounov AntonLapounov added this to the 7.0.0 milestone Jun 20, 2022
@mangod9
Member

mangod9 commented Jul 19, 2022

@AntonLapounov @kunalspathak guess we don't plan on fixing this for 7, right?

@AntonLapounov AntonLapounov modified the milestones: 7.0.0, Future Jul 19, 2022
@AntonLapounov
Member

Yes, apps will have to opt in to make all processor groups available to the runtime.

@kunalspathak
Member Author

Do we plan to include documentation of the environment variables to use all processor groups for Windows 11/Windows Server 2022?

@AntonLapounov
Member

Those configuration settings were introduced a long time ago and are already documented. If you think the documentation should be improved, you may submit a documentation PR.

@kunalspathak
Member Author

If you think the documentation should be improved

They certainly need to be improved because of the change in behavior on Win11 and because many-core machines are getting more common. Under what API/category should the documentation go?

@jkotas
Member

jkotas commented Jul 19, 2022

Under what API/category should the documentation go to?

Affected APIs should get a note:

I suspect that we may still end up doing more product changes here given the feedback in #72441. I would wait for that issue to get resolved before starting on documentation edits.

(If you have access to a Win2022 machine with a large number of cores, it would be useful to get an answer to #72441 (comment).)

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 2, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 11, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Sep 10, 2022
6 participants