Increasing the pidScanInterval, as suggested in #5832, certainly makes the system more stable, although CPU usage still spikes to 100% intermittently, for anywhere from a few seconds to almost a minute.
Average CPU consumption is around 10-15% when the nomad executor processes aren't hogging the CPU.
Nomad version
Nomad v1.1.2 (60638a0)
Operating system and Environment details
OS Name: Microsoft Windows Server 2016 Standard
OS Version: 10.0.14393 N/A Build 14393
OS Manufacturer: Microsoft Corporation
OS Configuration: Member Server
OS Build Type: Multiprocessor Free
Registered Owner: Windows User
Registered Organization:
Product ID: 00377-60000-00000-AA934
Original Install Date: 03/09/2019, 10:04:45 AM
System Boot Time: 29/06/2021, 10:22:26 AM
System Manufacturer: VMware, Inc.
System Model: VMware Virtual Platform
System Type: x64-based PC
Processor(s): 6 Processor(s) Installed.
[01]: Intel64 Family 6 Model 63 Stepping 0 GenuineIntel ~2694 Mhz
[02]: Intel64 Family 6 Model 63 Stepping 0 GenuineIntel ~2694 Mhz
[03]: Intel64 Family 6 Model 63 Stepping 0 GenuineIntel ~2694 Mhz
[04]: Intel64 Family 6 Model 63 Stepping 0 GenuineIntel ~2694 Mhz
[05]: Intel64 Family 6 Model 63 Stepping 0 GenuineIntel ~2694 Mhz
[06]: Intel64 Family 6 Model 63 Stepping 0 GenuineIntel ~2694 Mhz
BIOS Version: Phoenix Technologies LTD 6.00, 12/12/2018
Windows Directory: C:\Windows
System Directory: C:\Windows\system32
Boot Device: \Device\HarddiskVolume1
Total Physical Memory: 65,535 MB
Available Physical Memory: 57,022 MB
Virtual Memory: Max Size: 87,551 MB
Virtual Memory: Available: 76,334 MB
Virtual Memory: In Use: 11,217 MB
Page File Location(s): C:\pagefile.sys
P:\pagefile.sys
Hotfix(s): 13 Hotfix(s) Installed.
[01]: KB3199986
[02]: KB4346087
[03]: KB4485447
[04]: KB4498947
[05]: KB4503537
[06]: KB4509091
[07]: KB4535680
[08]: KB4550994
[09]: KB4562561
[10]: KB4576750
[11]: KB5001078
[12]: KB5001402
[13]: KB5003638
Network Card(s): 1 NIC(s) Installed.
[01]: vmxnet3 Ethernet Adapter
Connection Name: Ethernet0
DHCP Enabled: No
Hyper-V Requirements: A hypervisor has been detected. Features required for Hyper-V will not be displayed.
Issue
We are seeing problems when running a large number of jobs (> 60): the nomad executor processes start consuming CPU and effectively stall the system. Eventually jobs are killed and rescheduled, and some fail outright. With around 60 jobs, the executor CPU overhead appears every few minutes, hits 100% for a couple of minutes (sometimes more than 10 minutes), and then drops and stabilises.
This appears to be related to issue #5832, where a similar problem occurred on Linux; however, I believe the fix for that issue does not apply to Windows.
This is becoming a major bottleneck for us: we run the HashiCorp stack, especially Nomad and Consul, in production and are considering moving to the Enterprise edition, but this issue has been a significant blocker.
Reproduction steps
Run any job of type service with a large count (> 100).
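For anyone trying to reproduce this, a minimal job file along these lines should be enough. It is only a sketch, not the job used in production: the job, group, and task names, the `dc1` datacenter, and the PowerShell sleep command are placeholders, and the `raw_exec` driver is assumed from the forum thread title (it has to be enabled on the client).

```hcl
# Hypothetical reproduction job: names, datacenter, and command are placeholders.
job "executor-cpu-repro" {
  datacenters = ["dc1"]
  type        = "service"

  group "sleepers" {
    # Each allocation gets its own nomad executor process on the client.
    count = 100

    task "sleep" {
      # Assumes raw_exec is enabled in the client configuration.
      driver = "raw_exec"

      config {
        command = "powershell.exe"
        args    = ["-NoProfile", "-Command", "Start-Sleep -Seconds 86400"]
      }

      # Small reservations so that all 100 allocations fit on a single node.
      resources {
        cpu    = 20
        memory = 32
      }
    }
  }
}
```

The workload itself is idle, so any sustained CPU usage seen on the client should come from the executor processes rather than from the tasks.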
I also posted details about this in the forum but never received a response:
https://discuss.hashicorp.com/t/nomad-v1-0-1-executor-raw-exec-processes-appear-to-consume-significant-cpu-on-windows/21771