Is already running for more than 5 minutes! [WARN] Killing stunned ViPullStatistics #297

Open
iammathh opened this issue May 17, 2022 · 11 comments

@iammathh

iammathh commented May 17, 2022

Hello, SexiGraf was a great find.

I was wondering whether the vCenter APIs or the PowerShell script itself is the bottleneck that makes ViPullStatistics.ps1 take more than 5 minutes to run. My infrastructure has around 350 vSAN hosts connected to one vCenter, spread across many clusters running 6.7 and 7.0.

Sometimes a run takes 9 minutes, sometimes 7 or 12, but never 5, so the collected data ends up not being time-consistent. Is there any way to improve the speed?

2022-05-17T15:15:03.4381074+00:00 [WARN] ViPullStatistics for ---masked--- is already running for more than 5 minutes!
2022-05-17T15:15:03.4386787+00:00 [WARN] Killing stunned ViPullStatistics for ---masked---

I'm testing the latest version, Highway 17.

Thanks!

@rschitz
Member

rschitz commented May 17, 2022

Hi, thanks for reaching out!
350 vSAN hosts in a single vCenter may be too much, but I can look for ways to optimize if you send me the log file at plot [at] sexigraf.fr

@rschitz
Member

rschitz commented May 30, 2022

@iammathh in case it's related, could you try that code change please? #298

@acederlund
Contributor

I'm seeing this issue as well on a fresh install. I've altered the code of Send-BulkGraphiteMetrics.ps1 following the suggestions in #298, but the "SmartStatsSummary" step for one particular cluster seems to be what's causing the problem:

2022-06-07T10:30:41.7868123+00:00 [INFO] Processing vCenter ip_ip_ip_ip cluster clustername in datacenter datacentername
2022-06-07T10:30:41.7875290+00:00 [INFO] Processing vCenter ip_ip_ip_ip cluster clustername hosts in datacenter datacentername
2022-06-07T10:30:41.8321145+00:00 [INFO] Processing vCenter ip_ip_ip_ip cluster clustername vms in datacenter datacentername
2022-06-07T10:30:41.9369200+00:00 [INFO] Processing vCenter ip_ip_ip_ip cluster clustername datastores in datacenter datacentername
2022-06-07T10:30:41.9384214+00:00 [INFO] Start processing VsanPerfQuery in cluster clustername (v6.7+) ...
2022-06-07T10:30:42.8495180+00:00 [INFO] VsanPerfQueryPerf metrics collected in 0.8938623 sec for vSan Cluster clustername in vCenter ip_ip_ip_ip
2022-06-07T10:30:42.8629893+00:00 [INFO] Start processing SmartStatsSummary in cluster clustername (v6.7+) ...
**********************
PowerShell transcript start
Start time: 20220607103502
**********************
Transcript started, output file is /var/log/sexigraf/ViPullStatistics.log
2022-06-07T10:35:02.1588779+00:00 [INFO] ViPullStatistics v0.9.990
Transcript started, output file is /var/log/sexigraf/VsanDisksPullStatistics.ip.ip.ip.ip.log
Transcript started, output file is /var/log/sexigraf/VsanDisksPullStatistics.log
2022-06-07T10:35:02.1935261+00:00 [INFO] Importing PowerCli and Graphite PowerShell modules ...
2022-06-07T10:35:04.7331891+00:00 [INFO] Looking for another ViPullStatistics for ip.ip.ip.ip ...
2022-06-07T10:35:04.8279988+00:00 [WARN] ViPullStatistics for ip.ip.ip.ip is already running for more than 5 minutes!
2022-06-07T10:35:04.8286823+00:00 [WARN] Killing stunned ViPullStatistics for ip.ip.ip.ip

Is there any way to find out what is causing this?
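
One way to narrow this down from the log alone is to rank the timestamped lines by how long passes before the next timestamped line appears. The following is a minimal sketch in Python, not part of SexiGraf; it assumes the log format matches the excerpts in this thread, and the names `parse_line`, `slowest_steps` and `SAMPLE` are illustrative only.

```python
import re
from datetime import datetime

# Timestamp prefix as seen in the excerpts, e.g.
# 2022-06-07T10:30:42.8629893+00:00 [INFO] ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)([+-]\d{2}:\d{2}) (.*)$"
)


def parse_line(line):
    """Return (timestamp, message), or None for lines without a timestamp."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    base, frac, offset, message = m.groups()
    # The log carries 7 fractional digits; datetime keeps at most 6.
    stamp = datetime.strptime(
        f"{base}.{frac[:6]}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z"
    )
    return stamp, message


def slowest_steps(lines, top=3):
    """Rank timestamped lines by the seconds elapsed until the next one."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    gaps = [
        ((nxt[0] - cur[0]).total_seconds(), cur[1])
        for cur, nxt in zip(entries, entries[1:])
    ]
    return sorted(gaps, reverse=True)[:top]


# A few lines copied from the excerpt above (assumed representative).
SAMPLE = [
    "2022-06-07T10:30:41.9384214+00:00 [INFO] Start processing VsanPerfQuery in cluster clustername (v6.7+) ...",
    "2022-06-07T10:30:42.8629893+00:00 [INFO] Start processing SmartStatsSummary in cluster clustername (v6.7+) ...",
    "PowerShell transcript start",
    "2022-06-07T10:35:02.1588779+00:00 [INFO] ViPullStatistics v0.9.990",
]

for seconds, message in slowest_steps(SAMPLE):
    print(f"{seconds:8.1f}s  {message}")
```

On these lines the SmartStatsSummary step ranks first. Note that the gap ends at the first line of the next scheduled run, so it shows how long the step had been in progress when the next run started, not how long it would have taken to finish.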

@acederlund
Contributor

I should also add that it will complete in some runs, and these statistics take a very long time to process:

2022-06-07T10:35:38.2709865+00:00 [INFO] Start processing SmartStatsSummary in cluster clustername (v6.7+) ...
2022-06-07T10:39:37.6640096+00:00 [INFO] Processing spaceUsageByObjectType in vSAN cluster clustername (v6.2+) ...

As you can see, nearly four minutes for this.

@rschitz
Member

rschitz commented Jun 9, 2022

Let me check whether I can parallelize this particular query.
In the meantime, would you agree to test without the SMART metrics collection?
You'd only have to comment out this line:

$VsanClusterHealthSystem = Get-VSANView -Id VsanVcClusterHealthSystem-vsan-cluster-health-system -Server $Server
or this one if you have the latest version running
$VsanClusterHealthSystem = Get-VSANView -Id VsanVcClusterHealthSystem-vsan-cluster-health-system -Server $Server
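
To illustrate the parallelization idea mentioned in this comment, here is a minimal sketch of querying hosts concurrently under an overall deadline, so that one slow host cannot stall the rest of the collection. It is written in Python rather than the PowerShell that SexiGraf uses, and it is not the project's implementation: `query_host` is a hypothetical stand-in that only sleeps, and `collect` and the host names are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def query_host(host):
    """Hypothetical per-host query; sleeps to simulate one slow host."""
    delay = 0.5 if host == "slow-host" else 0.01
    time.sleep(delay)
    return {"host": host, "status": "ok"}


def collect(hosts, deadline_sec, worker=query_host):
    """Query all hosts concurrently.

    Returns (results keyed by host, sorted list of hosts that missed
    the deadline).
    """
    pool = ThreadPoolExecutor(max_workers=len(hosts) or 1)
    futures = {pool.submit(worker, h): h for h in hosts}
    done, not_done = wait(futures, timeout=deadline_sec)
    # Threads cannot be killed, so stragglers finish in the background;
    # we simply stop waiting for them (cancel_futures needs Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)
    # f.result() re-raises any exception raised by the worker.
    results = {futures[f]: f.result() for f in done}
    late = sorted(futures[f] for f in not_done)
    return results, late


results, late = collect(["host-a", "host-b", "slow-host"], deadline_sec=0.2)
print("collected:", sorted(results))
print("missed deadline:", late)
```

The design choice here is to return partial results together with the list of late hosts, which would also point at the host responsible for the delay.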

@rschitz
Member

rschitz commented Jun 9, 2022

I should also add that it will complete in some runs, and these statistics take a very long time to process:

2022-06-07T10:35:38.2709865+00:00 [INFO] Start processing SmartStatsSummary in cluster clustername (v6.7+) ...
2022-06-07T10:39:37.6640096+00:00 [INFO] Processing spaceUsageByObjectType in vSAN cluster clustername (v6.2+) ...

As you can see, nearly four minutes for this.

How many hosts in that cluster?

@acederlund
Contributor

How many hosts in that cluster?

Only six hosts at the moment!

@acederlund
Contributor

Let me check if i can parallelize this particular query In the meantime would you agree to test without the smart metrics collection? you'd only have to comment this line

$VsanClusterHealthSystem = Get-VSANView -Id VsanVcClusterHealthSystem-vsan-cluster-health-system -Server $Server

or this one if you have the latest version running

$VsanClusterHealthSystem = Get-VSANView -Id VsanVcClusterHealthSystem-vsan-cluster-health-system -Server $Server

I will try this and get back to you shortly!

@acederlund
Contributor

Let me check if i can parallelize this particular query In the meantime would you agree to test without the smart metrics collection? you'd only have to comment this line

$VsanClusterHealthSystem = Get-VSANView -Id VsanVcClusterHealthSystem-vsan-cluster-health-system -Server $Server

or this one if you have the latest version running

$VsanClusterHealthSystem = Get-VSANView -Id VsanVcClusterHealthSystem-vsan-cluster-health-system -Server $Server

That looks much better:
image

@rschitz
Member

rschitz commented Jun 9, 2022

That looks much better: image

That's good news!

@rschitz
Member

rschitz commented Jun 9, 2022

How many hosts in that cluster?

Only six hosts at the moment!

That might be down to a single host then. The weird thing is that the feature is supposed to get cached information from vCenter to avoid querying the hosts directly.

@rschitz rschitz added the bug label Nov 25, 2023
3 participants