Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase Timeouts and Worker Memory for Azure SSA #16016

Merged

Conversation

jerryk55
Copy link
Member

@jerryk55 jerryk55 commented Sep 22, 2017

In support of high priority BZ https://bugzilla.redhat.com/show_bug.cgi?id=1488967
we need to increase various timeouts to allow the Azure SSA job to succeed.

  1. Increase the Job timeout specific to Azure SSA similar to how it has been
    done for SCVMM previously.
  2. Increase the SmartProxyWorker MiqQueue msg_timeout value similar to how it has
    been done for both SCVMM and OpenStack previously.
  3. Increase the memory_threshold and restart_interval for all SmartProxyWorker jobs.
    The memory_threshold issue has been seen running SSA on other providers as well as
    Azure so the overall default change here is appropriate.

Other PRs are related to the BZ as well but this PR may be merged in any order as there
are no prerequisites for it.

This PR should be back ported to FINE.

Please note that we should probably refactor the timeout code in items 1 and 2 above to distribute it to the Providers in question rather than have it live in the ManageIQ repo but that is an item for another day.

Links

Steps for Testing/QA [Optional]

Run SSA on various providers including Azure - especially with Managed Disks. along with other required PRs referenced in the BZ above the scan should complete.

@roliveri @jrafanie please review and merge as soon as possible to allow us to get this BZ addressed.

@jerryk55
Copy link
Member Author

@miq-bot add_label fine/yes

In support of high priority BZ https://bugzilla.redhat.com/show_bug.cgi?id=1488967
we need to increase various timeouts to allow the Azure SSA job to succeed.
1) Increase the Job timeout specific to Azure SSA similar to how it has been
   done for SCVMM previously.
2) Increase the SmartProxyWorker MiqQueue msg_timeout value similar to how it has
   been done for both SCVMM and OpenStack previously.
3) Increase the memory_threshold and restart_interval for all SmartProxyWorker jobs.
   The memory_threshold issue has been seen running SSA on other providers as well as
   Azure so the overall default change here is appropriate.
@jerryk55 jerryk55 force-pushed the increase_msg_timeout_for_azure_ssa branch from 191b2a9 to 3b5b283 Compare September 22, 2017 13:20
@miq-bot
Copy link
Member

miq-bot commented Sep 22, 2017

Checked commit jerryk55@3b5b283 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
2 files checked, 0 offenses detected
Everything looks fine. ⭐

@@ -82,6 +82,9 @@ def queue_call(ost)
if target.kind_of?(ManageIQ::Providers::Openstack::CloudManager::Vm) ||
target.kind_of?(ManageIQ::Providers::Openstack::CloudManager::Template)
timeout_adj = 4
elsif target.kind_of?(ManageIQ::Providers::Azure::CloudManager::Vm) ||
target.kind_of?(ManageIQ::Providers::Azure::CloudManager::Template)
timeout_adj = 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, we should probably have these classes implement a timeout function so we can just ask the target what it's timeout should be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup I said something to that effect in the description of the PR.

@@ -178,6 +178,9 @@ def timeout_adjustment
if target.kind_of?(ManageIQ::Providers::Microsoft::InfraManager::Vm) ||
target.kind_of?(ManageIQ::Providers::Microsoft::InfraManager::Template)
timeout_adjustment = 4
elsif target.kind_of?(ManageIQ::Providers::Azure::CloudManager::Vm) ||
target.kind_of?(ManageIQ::Providers::Azure::CloudManager::Template)
timeout_adjustment = 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below.

@jrafanie
Copy link
Member

@miq-bot add_label smart state

@jrafanie
Copy link
Member

@miq-bot add_label fine/yes

@jrafanie
Copy link
Member

@miq-bot Y U NO WORK

@miq-bot
Copy link
Member

miq-bot commented Sep 22, 2017

😵

@NickLaMuro
Copy link
Member

@jrafanie that is cheating... no logging into the bot account...

@miq-bot
Copy link
Member

miq-bot commented Sep 22, 2017

@jrafanie unrecognized command 'Y', ignoring...

Accepted commands are: add_label, assign, close_issue, move_issue, remove_label, rm_label, set_milestone

@jrafanie
Copy link
Member

I'm fine with the code change. There's no test even though the timeout_adjustment method seems easily tested. 😞

@roliveri please review/merge.

@roliveri roliveri self-assigned this Sep 22, 2017
@roliveri roliveri merged commit a3c9213 into ManageIQ:master Sep 22, 2017
@jrafanie jrafanie added this to the Sprint 70 Ending Oct 2, 2017 milestone Sep 22, 2017
simaishi pushed a commit that referenced this pull request Sep 22, 2017
…e_ssa

Increase Timeouts and Worker Memory for Azure SSA
(cherry picked from commit a3c9213)

https://bugzilla.redhat.com/show_bug.cgi?id=1488967
@simaishi
Copy link
Contributor

Fine backport details:

$ git log -1
commit 8770490a135c6ee0c93872873f3a941f8cfa1198
Author: Richard Oliveri <[email protected]>
Date:   Fri Sep 22 12:09:43 2017 -0400

    Merge pull request #16016 from jerryk55/increase_msg_timeout_for_azure_ssa
    
    Increase Timeouts and Worker Memory for Azure SSA
    (cherry picked from commit a3c9213171c1283eebe243c6355a334c4f21e085)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1488967

d-m-u pushed a commit to d-m-u/manageiq that referenced this pull request Jun 6, 2018
…for_azure_ssa

Increase Timeouts and Worker Memory for Azure SSA
(cherry picked from commit a3c9213)

https://bugzilla.redhat.com/show_bug.cgi?id=1488967
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants